Mission

Why don’t we pay attention to our representatives between elections?

Legislative data is hard to parse, track, and organize. Activists, concerned citizens, and the curious may not have the time, resources, or expertise to build out duplicative tech stacks. Existing solutions may be limited by the willingness of organizations and companies to continue to run and host them - such as in the case of Google’s Civic Information API, which was shut down earlier this year. What would a decentralized, open-source legislative data solution look like?

The Govbot team’s goal is to bridge this gap by building a framework for creating and using federated, open-source, non-profit legislative data. Built as a Chi Hack Night Breakout Group, the project includes an open-source, simplified, and expanded version of OpenStates’ data on state and federal legislation, as well as example applications.

What We Offer

The main Govbot dataset currently includes legislative updates from bills in the U.S. House & Senate, all 50 states, territories like Guam, and the city of Chicago, as .json files organized using the Project Open Data catalog format. The Govbot scrapers run regularly, appending new logs and then running them through Claude to provide topic-based tagging and summaries. This data can then be analyzed using SQL, via an interface built with DuckDB, or plugged into applications like our example website, WindyCivi, and a test BlueSky bot built in collaboration with U.S. Representative Hoan Huynh (https://bsky.app/profile/test-hoan-huynh.bsky.social).

How Do I Use It?

1. Install

sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"

2. Run govbot

govbot

That’s it. If no govbot.yml exists, an interactive wizard walks you through setup:

Sources - Choose all 47 states or pick specific ones
Tags - Start with an example tag, or get an AI prompt you can copy-paste to create your own
Publishing - RSS feeds configured automatically

The wizard creates govbot.yml, .gitignore, and a GitHub Actions workflow.

3. Run the pipeline

Once set up, running govbot again executes the full pipeline:

Clones/updates legislation repositories
Tags bills based on your tag definitions
Generates RSS feeds in the docs/ directory

Other Commands

govbot clone all           # download all state legislation datasets
govbot clone il ca ny      # download specific states
govbot logs                # stream legislative activity as JSON Lines
govbot logs | govbot tag   # process and tag data
govbot build               # generate RSS feeds
govbot load                # load bill metadata into DuckDB database
govbot delete all          # remove all downloaded data
govbot update              # update govbot to latest version
govbot --help              # see all commands and options
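
Because govbot logs emits JSON Lines, its output can be consumed by any script that reads one JSON object per line. The sketch below is a minimal, hedged example of that pattern in Python; the field names in the sample ("locale", "action") are hypothetical stand-ins, not the actual Govbot log schema.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def parse_json_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_field(entries: Iterable[dict], field: str) -> Counter:
    """Count entries by a top-level field; entries missing it count under None."""
    return Counter(entry.get(field) for entry in entries)


# Stand-in for `govbot logs` output. These field names are assumptions
# made for illustration only.
sample = [
    '{"locale": "il", "action": "introduced"}',
    "",
    '{"locale": "il", "action": "passed"}',
    '{"locale": "ca", "action": "introduced"}',
]
counts = count_by_field(parse_json_lines(sample), "locale")
```

In a real pipeline, the same functions could read from sys.stdin instead of the sample list, for example when piping govbot logs into a script.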

Dataset Key:

🆕: the locale’s data received updates since your last cloning
✅: the data you’ve cloned is up-to-date with the most current version
🔄: the data is currently being updated
❌: the data is not currently accessible

Querying in SQL using DuckDB

You can query the data using SQL, via DuckDB, which creates a simulated database from the .json log files. See DUCKDB.md for more details.

Running Queries in the Command Line

-- Load JSON extension
INSTALL json;
LOAD json;

-- Query all bill metadata
SELECT * 
FROM read_json_auto('~/.govbot/repos/**/bills/*/metadata.json')
LIMIT 10;
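
The glob in the query above can also be walked without a database. As a rough sketch, assuming the same ~/.govbot/repos layout, the Python below collects the first few metadata.json documents using only the standard library; the function names are illustrative and not part of Govbot.

```python
import json
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def iter_bill_metadata(repos_root: Path) -> Iterator[dict]:
    """Yield parsed metadata.json documents under **/bills/*/metadata.json."""
    # Sorted so that iteration order is stable across runs.
    for path in sorted(repos_root.glob("**/bills/*/metadata.json")):
        with path.open(encoding="utf-8") as handle:
            yield json.load(handle)


def first_n(repos_root: Path, n: int = 10) -> List[dict]:
    """Return up to n metadata documents, similar in spirit to LIMIT n."""
    return list(islice(iter_bill_metadata(repos_root), n))


# Assumed default location, taken from the path used in the SQL example.
DEFAULT_ROOT = Path.home() / ".govbot" / "repos"
```

Calling first_n(DEFAULT_ROOT) returns an empty list when nothing has been cloned yet.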

Additional Commands, and Querying via the Web UI

Additional examples of commands, and setup for the web UI, can be found below:

# Load all data into a database (default: govbot.duckdb)
govbot load

# Or specify a custom database file
govbot load --database my-bills.duckdb

# With memory limit and thread settings
govbot load --memory-limit 32GB --threads 8

# Open in DuckDB UI (opens in your browser)
duckdb --ui govbot.duckdb

Helper Scripts

# Run example queries
./duckdb-query.sh examples/duckdb-example.sql

Contributing & Testing

Prerequisites

Folks looking to contribute should have knowledge of Rust and the just task runner. Run just setup to start, and then just govbot ... to develop the CLI.

The following should also be installed:

Rust & Cargo: Install the Rust Toolchain
Just: Install the task runner: cargo install just

Development Workflow

Use just govbot ... as your CLI “dev” environment.

Other Useful Commands

just - See all available tasks
just test - Run all tests
just review - Review snapshot test changes
just mocks [LOCALES...] - Update mock data for testing

We build snapshots off examples. Add examples to make a test.

Advanced

GOVBOT_REPO_URL_TEMPLATE="https://gitsite.com/org/{locale}.git" govbot ...

Project History

The Govbot project began in 2022, with a vision to create a destination for simplified, summarized updates on legislative action, with the ability to follow or filter for certain legislative topics. The result was the initial Windy Civi app, and website, launched in beta in 2024.

While building the solution, the team began to consider the limitations of a centrally-managed data source and platform, versus one that could be decentralized, that was open-source, and that allowed for exploration and use of the data in ways beyond initial designs.

Our vision now has pivoted to building that data set, as well as building sample applications and solutions to ensure that government accountability can be accessible to all.

FAQs

Can I See The Repo?

Yes! Our main repo is windy-civi/windy-civi. The repo that is being used to run and store the data - the ‘toolkit’ repo - is chihacknight/govbot.

How Is The Data Structured?

You can find the file format structure and .json schema in DATA_STRUCTURES.md.

How Do I Clone This Data?

Each locale is scraped using a GitHub Actions template that is defined and explained in detail in README_TEMPLATE.md. You can follow this template to create a new repository of locale data.

To help manage multiple pipelines or locales, look at our pipeline manager documentation.

How Can I Stay Updated, Or Get In Touch?

You can stay updated by following our work at Chi Hack Night, as well as on the related Slack (see below). You can also follow our commits and updates on GitHub and this Docs page.

You can message us on the Chi Hack Night Slack - we have our own channel.

Let The People Take Back Their Government

Our 2024 efforts taught us a few things:

People don’t want to download another app or use another portal.
Getting bill data is really hard, and usually involves using private APIs.
Creating AI summaries and topics should be controlled by organizations/individuals.

As such, our effort has become two-fold:

Decentralize Government Data
Gov News Bot Builder

Contributors

2025 Deck – Democratic Infrastructure


Govbot

Federated, open-source legislative data for everyone

Overview

The Problem
Our Solution
What We Offer
Features
Setup + Core Functions

The Problem

Why don’t we pay attention to our representatives between elections?

Legislative data is hard to parse, track, and organize. Activists, concerned citizens, and the curious may not have the time, resources, or expertise to build out duplicative tech stacks.

The Problem (cont.)

Existing solutions may be limited by the willingness of organizations and companies to continue to run and host them - such as in the case of Google’s Civic Information API, which was shut down earlier this year.

What would a decentralized, open-source legislative data solution look like?

Our Solution

The Govbot team’s goal is to bridge this gap - building the framework for federated, open-source, non-profit legislative data.

Built as a Chi Hack Night Breakout Group, this project offers frameworks and tools built on top of OpenStates’ data on state and federal legislation.

What We Offer

The main Govbot dataset currently includes legislative updates from:

the U.S. House & Senate
Legislatures from all 50 states
Legislatures from U.S. territories

Data is organized as .json files using the Project Open Data catalog format, scraped and appended regularly.

Features

A decentralized, regularly updating, legislative data catalog
AI-powered, topic-based tagging and summaries, customized using .yml
SQL querying via DuckDB interface
Example applications, like custom websites (see our demo WindyCivi site), and social media bots (see our BlueSky bot, made in collaboration with U.S. Representative Hoan Huynh)

Setup

You can download the setup script via one-line install, from our GitHub repository:

sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"

Core Functions

Once installed, you can:

Clone the entire dataset
Clone specific items (state, session, or bill)
Load metadata into a SQL-accessible DuckDB database

Project History

2022: socratic.center

The Govbot project began in 2022 at socratic.center with a vision to create a destination for simplified, summarized updates on legislative action.

The initial hypothesis: *What if citizens could easily track and understand the bills being voted on?*

civi.social

We built civi.social, exploring how to make legislative information accessible and shareable on social platforms.

This experiment helped us understand how citizens wanted to engage with civic data in their existing communities.

myChicago + Jarvis

We created a prototype for what integration with the myChicago platform would look like.

The goal of Jarvis, our AI-powered assistant, would have been to help users understand legislation through:
- Simplified bill summaries
- Contextual information
- Guided engagement tools

Windy Civi: Full Launch

After reflecting on previous concepts, the Windy Civi app and website launched in beta in 2024.

The goal was to enable citizens to:
- Track bills by topic
- Receive personalized updates
- Connect directly with representatives

Rethinking Our Approach

While building these solutions, we began to ask a critical question:

What are the limitations of a centrally-managed platform?

Can it scale to serve all communities?
What happens if we stop maintaining it?
How can others build on this work?

Our New Vision

Our vision has now pivoted to building the infrastructure itself:

A decentralized legislative data catalog
Reusable frameworks for communities to build their own tools
Sample applications demonstrating use cases

Our goal: Ensure that government accountability is accessible to all.

Live Demos

Basic Setup + Commands
Querying via DuckDB
Creating Social Media Bots

Basic Setup + Commands

Install via:

sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"

Once installed, you can download and set up the data using the following commands:

govbot --help # to see help
govbot clone # to show available datasets
govbot clone {{locale}} {{locale}} # download specific items
govbot delete {{locale}} # delete specific items
govbot delete all # delete everything
govbot load # load bill metadata into DuckDB

Querying with DuckDB

First, set up DuckDB, which creates a simulated database from the .json log files:

govbot load #Load all data into a database
govbot load --database my-bills.duckdb #Specify a custom database file
govbot load --memory-limit 32GB --threads 8 #With memory limit and thread settings

duckdb --ui govbot.duckdb #Open in DuckDB UI (opens in browser)

Once the DuckDB database is created, you can query as normal:

-- Load JSON extension
INSTALL json;
LOAD json;
-- Query all bill metadata
SELECT *
FROM read_json_auto('~/.govbot/repos/**/bills/*/metadata.json')
LIMIT 10;

Technical Details

Our Open Civic Data Proposal

Democratizing government data

What does it mean to democratize government data?
Today: legislation (with room to expand to courts, agencies, and more)
To understand the solution, it helps to first understand the problem

The problem

Legislative data is commonly distributed through APIs or large database dumps. These approaches work well for transactional access, but they introduce real limitations when the goal is long-term analysis and accountability.

They make it harder to:

Perform bulk or historical analysis
Track changes over time
Analyze data without running a database server

They also introduce fragility:

APIs change or disappear
Long-term access and verification become difficult

API-based access breaks over time

Why this matters

Civic trust
Research
Accountability
Anyone can verify, not just institutions
A shared source of truth without interpretation baked in

State-based systems vs append-only logs

What we built (and why Git)

File-based structure
Bills, events, logs
Deterministic paths to find things
Built on Git for history, distribution, cheap branching, and broad accessibility
Aligned with Open States data and Open Civic Data (OCD) identifiers
Formalized through an Open Civic Data proposal

This design treats the filesystem as the primary interface for civic data.

The OCD proposal (why this matters upstream)

Makes the model reusable beyond Windy Civi
Provides shared vocabulary and structure
Enables other projects to adopt or adapt the approach

Technical challenges and triumphs

Making transformations deterministic so Git diffs remain meaningful
Interpreting and triaging state-by-state scraper errors
Passing data cleanly between CI steps (artifacts, environment variables, Docker parity)
Designing self-contained log entries that remain analyzable outside their folder context
Building a “last seen” mechanism when upstream sources return full snapshots
Identifying hard limits: PDF redlines and crossouts remain an open problem
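
The "last seen" idea can be illustrated with a small diff step: when an upstream source returns a full snapshot, compare it against the previously stored state and emit log events only for items that are new or changed. The sketch below is a simplified illustration under that assumption; the snapshot shape (item ID mapped to payload) and the event names are hypothetical, and it does not reflect the project's actual implementation.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, dict]  # item ID -> item payload (assumed shape)


def diff_snapshot(last_seen: Snapshot, current: Snapshot) -> List[Tuple[str, str]]:
    """Return (event_type, item_id) pairs for items that are new or changed.

    Items missing from `current` are ignored here; an append-only log
    would need a separate policy for removals.
    """
    events = []
    for item_id in sorted(current):  # sorted keeps the output deterministic
        if item_id not in last_seen:
            events.append(("added", item_id))
        elif current[item_id] != last_seen[item_id]:
            events.append(("updated", item_id))
    return events


previous = {"action-1": {"description": "Filed"}}
latest = {
    "action-1": {"description": "Filed"},          # unchanged: no event
    "action-2": {"description": "First reading"},  # new: one "added" event
}
new_events = diff_snapshot(previous, latest)
```

Only the events returned by the diff would be appended to the log, so re-scraping an unchanged snapshot produces no new files and no Git diff.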

A Dive Into Local AI Tagging

Use two models for two very different roles

Smart LLM (ChatGPT / Claude / Cursor)
- Human-in-the-loop
- Used during development
- Produces tag configuration
Small embedding model
- Fully automated
- Used in production
- Categorizes every update

The smart LLM helps write the rules. The small model runs them.

Step 1: Tag Authoring (Developer Workflow)

A developer sits down with:

Sample legislative updates
Court rulings
Regulatory notices

Using ChatGPT / Claude / Cursor, they prompt:

“Create a tag config for legislative bill introductions. Include examples, negative examples, and keywords.”

The output is reviewed, edited, and committed like code.

Important Clarification

The “smart” LLM is not part of production.

It is used the same way you’d use:

A code editor
A linter
A schema generator

Think of ChatGPT / Claude / Cursor as a tag authoring tool.

What the Smart LLM Actually Does

The smart LLM is used interactively by a developer to:

Define new tags
Refine descriptions
Generate examples and edge cases
Identify negative examples
Propose include / exclude keywords

It replaces manual taxonomy writing — not runtime logic.
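
To make the idea of a reviewed, committed tag configuration concrete, here is a hedged sketch of what one piece of it could look like in Python. The TagConfig fields are a hypothetical schema, and the matching shown is plain keyword filtering only; it does not reproduce the embedding model that categorizes updates in production.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TagConfig:
    """Hypothetical tag definition, as it might be authored with an LLM's help."""
    name: str
    description: str
    include_keywords: List[str] = field(default_factory=list)
    exclude_keywords: List[str] = field(default_factory=list)


def keyword_match(tag: TagConfig, text: str) -> bool:
    """Return True if text hits an include keyword and no exclude keyword."""
    lowered = text.lower()
    if any(word.lower() in lowered for word in tag.exclude_keywords):
        return False
    return any(word.lower() in lowered for word in tag.include_keywords)


bill_intro = TagConfig(
    name="bill-introduction",
    description="A bill is formally introduced in a chamber.",
    include_keywords=["introduced", "first reading"],
    exclude_keywords=["resolution"],
)

is_intro = keyword_match(bill_intro, "HB250 introduced in the House")
is_excluded = keyword_match(bill_intro, "Resolution introduced honoring a teacher")
```

Because the configuration is plain data, it can be diffed and reviewed in Git like any other code change.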


What’s next for the project

Building relationships with activists and journalists
Creating + designing customizable tagging templates + a system to share them
Incorporating Executive Orders, judicial opinions, and other relevant non-legislative documents
Exploring use cases for the data, such as automated content pipelines
Adding donation data for analysis of legislative priorities and campaign promises

Special thanks to the following contributors:

Sartaj Chowdhury Tamara Dowis Edwin Chalas Cuevas Andrew Dauphinais Emme Kari Douglass Marissa Heffler Zach Schoneman Brian Burns

Thank You!

Chi Hack Night
Open States
Open Civic Data community

Building government accountability tools accessible to all

Appendix

Contributing & Testing
FAQs

Contributing & Testing

Prerequisites

Knowledge of Rust and the just task runner required.

Rust & Cargo: Install the Rust Toolchain
Just: Install the task runner: cargo install just

Development Workflow

Use just govbot ... as your CLI “dev” environment.

Useful Commands:

just - See all available tasks
just test - Run all tests
just review - Review snapshot test changes
just mocks [LOCALES...] - Update mock data for testing

Dataset Status Key

🆕 The locale’s data received updates since your last cloning
✅ Your data is up-to-date with the most current version
🔄 The data is currently being updated
❌ The data is not currently accessible

FAQs: Repositories

Can I See The Repo?

Main repo: windy-civi/windy-civi
Toolkit repo: chihacknight/govbot

FAQs: Data Structure

How Is The Data Structured?

Find the file format structure and .json schema in the readme.md: DATA_STRUCTURES.md

FAQs: Cloning Data

How Do I Clone This Data?

Each locale is scraped using a GitHub Actions template explained here: README_TEMPLATE.md

To manage multiple pipelines or locales, see our pipeline manager documentation

Stay Connected

How Can I Stay Updated, Or Get In Touch?

Follow our work at Chi Hack Night
Check commits and updates on GitHub
Visit our Docs page
Join the Chi Hack Night Slack

2025 Bill Blockchain

Open Civic Data Blockchain Proposal

This proposal outlines a decentralized, peer-to-peer system for managing and publishing civic data using a blockchain-like append-only log. Built on the Open Civic Data schema and powered by Git, this architecture enables transparency, tamper-resistance, and flexibility in how public information is stored, shared, and consumed. By treating government data as a series of verifiable, timestamped events, we create an ecosystem where organizations and individuals can build custom civic feeds, automate updates, and uncover hidden dynamics in governance—all without relying on centralized servers.

Why Use a Hashed Append-Only Log?

🔐 Truly Peer-to-Peer
Everyone keeps their own copy of the data—no central server needed, no extra cost.
📜 The Constitution Is Basically a Blockchain
Government changes through amendments. Our log reflects this: permanent, append-only, and transparent.
💻 Highly Tailored Custom Feeds Built With Code + AI
Composable event logs will be easy to filter, tag, and summarize. Orgs can also combine those feeds to make highly tailored feeds for publishing.
🤖 Publish Everywhere with Bots
Organizations can easily automate updates to any number of platforms (think Reddit replies or Bluesky alert posts) and layer bots on top of each other. In addition, we can build tooling for public RSS feeds that news organizations can then import.
⛓️ Blockchain without the Cringe or Cost
Blockchain hashes + public key signatures let users verify data themselves without expensive proof algorithms. For IDs, Decentralized Identifiers are the new standard, and they interoperate with Bluesky.
☎️ Network Agnostic
Supports everything: peer-to-peer, pub-sub, polling, WebRTC, email, RSS, push notifications, etc. They will all work naturally.
📱 Our App Becomes A Glorified P2P Feed Reader With Civic Tendencies
By being a P2P feed reader with special features around civic data, we simplify the app itself, and allow others to make their own client apps.
🛜 RSS Feeds Just Work
Feed-based design lets us easily pull in existing sources like Executive Orders or court decisions via RSS, and allows organizations to pull news website feeds.
⏪ Bonus: Reveal Power Dynamics
Replay legislative logs to uncover hidden patterns—who votes when, with whom, and under whose influence.
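
The core of a hashed append-only log is that each entry commits to the hash of the entry before it, so any later edit to history is detectable. The following is a minimal sketch of that idea in Python, not the project's implementation: the entry layout and the GENESIS constant are assumptions, and public key signatures are left out.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(log: List[dict], payload: dict) -> None:
    """Append a payload, chaining it to the hash of the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def verify(log: List[dict]) -> bool:
    """Recompute every hash in order; False means history was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[dict] = []
append_entry(log, {"event": "bill_created", "bill": "SB1234"})
append_entry(log, {"event": "sponsor_added", "bill": "SB1234"})
```

Anyone holding a copy of the log can run verify on it locally, which is what makes the peer-to-peer model workable without a trusted central server. Git provides a comparable guarantee through its commit hashes.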

Why Open Civic Data as the Base Schema?

🤝 Plug Into the Civic Tech Ecosystem
Uses familiar Open Civic Data formats, making it easy to integrate with existing tools and scrapers.
🔄 Reuse Existing Data
Works with platforms like OpenStates and Councilmatic, giving us access to many data sources.

Why Git for Data Storage?

📁 Folders + Files = Maximum Portability
The most universal data structure—easy to read, edit, and share across tools and platforms.
🔄 Git Is Already Peer-to-Peer
Git is built on a distributed log. git pull works seamlessly in our app and AI workflows.
🌐 GitHub = Easy Browsing
Markdown rendering and file previews make GitHub a friendly UI for exploring without needing to clone. We can also expose RSS feeds via GitHub Pages.
🧩 Submodules Keep Repos Lean
Git submodules let us split large datasets across repos, so no single repo gets bloated.

Folder Structure + Filename Convention

/open-civic-data-blockchain/
├── country:us/                                 # United States
│   ├── state:il/                               # Illinois state
│   │   ├── sessions/                           # Legislative sessions
│   │   │   ├── ocd-session/country:us/state:il/2023-2024/  # Full OCD session ID
│   │   │   │   ├── bills/                      # Bills in this session
│   │   │   │   │   ├── sb1234/                 # Senate Bill 1234
│   │   │   │   │   │   ├── logs/               # Event logs folder
│   │   │   │   │   │   │   ├── 20240115T123045Z_session_bill_created.json  # Initial bill creation in session
│   │   │   │   │   │   │   ├── 20240115T123045Z_metadata_created.json      # Initial metadata creation
│   │   │   │   │   │   │   ├── 20240117T143022Z_metadata_updated.json      # Metadata update with field mask
│   │   │   │   │   │   │   ├── 20240117T143156Z_sponsor_added.json         # Sponsors added
│   │   │   │   │   │   │   ├── 20240120T092133Z_version_added.json         # Version document added
│   │   │   │   │   │   │   ├── 20240130T152247Z_action_added.json          # Action recorded
│   │   │   │   │   │   │   ├── 20240215T103045Z_doc_added.json             # Supporting document added
│   │   │   │   │   │   │   ├── 20240315T140011Z_vote_initiated.json        # Vote started
│   │   │   │   │   │   │   ├── 20240315T143022Z_vote_updated.json          # Vote partial results
│   │   │   │   │   │   │   └── 20240315T150537Z_vote_finalized.json        # Vote complete
│   │   │   │   │   │   └── files/              # Raw file storage
│   │   │   │   │   │       ├── bill_introduced.pdf      # Original version document
│   │   │   │   │   │       ├── bill_amended.pdf         # Amended version document
│   │   │   │   │   │       └── fiscal_note.pdf          # Supporting document
│   │   │   │   │   ├── hb0789/                 # House Bill 789
│   │   │   │   │   │   ├── logs/               # Event logs folder
│   │   │   │   │   │   │   ├── 20240118T090023Z_session_bill_created.json  # Initial bill creation in session
│   │   │   │   │   │   │   ├── 20240118T090023Z_metadata_created.json      # Initial metadata creation
│   │   │   │   │   │   │   └── ...
│   │   │   │   │   │   └── files/              # Raw file storage
│   │   │   │   │   │       └── ...
│   │   │   │   │   └── ...
│   │   │   │   └── events/                     # Events for this session
│   │   │   │       ├── 2024-04-15-senate-appropriations-hearing.json  # Senate committee hearing
│   │   │   │       ├── 2024-02-22-house-floor-session.json            # House floor session
│   │   │   │       └── ...
│   │   │   ├── ocd-session/country:us/state:il/2021-2022/  # Previous session
│   │   │   │   └── ...
│   │   │   └── ...
│   │   └── events/                            # Events not tied to a specific session
│   │       ├── 2024-07-15-joint-commission-meeting.json  # Joint commission meeting
│   │       ├── 2024-08-20-special-task-force.json        # Special task force meeting
│   │       └── ...
│   ├── state:ca/                               # California state
│   │   └── ...
│   └── state:ny/                               # New York state
│       └── ...
└── country:ca/                                 # Canada
    └── ...
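
The log filenames in the tree above follow a timestamp_eventtype.json pattern, with a compact UTC timestamp. As an illustration only, the Python sketch below builds such names and paths; the helper names are hypothetical, and the session root is passed in rather than derived, since the session ID format is still open.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def log_filename(timestamp: datetime, event_type: str) -> str:
    """Build a name like 20240115T123045Z_metadata_created.json."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = timestamp.astimezone(timezone.utc)
    return f"{utc.strftime('%Y%m%dT%H%M%SZ')}_{event_type}.json"


def bill_log_path(session_root: str, bill_id: str,
                  timestamp: datetime, event_type: str) -> PurePosixPath:
    """Place a log file under <session_root>/bills/<bill>/logs/."""
    return (PurePosixPath(session_root) / "bills" / bill_id.lower()
            / "logs" / log_filename(timestamp, event_type))


created_at = datetime(2024, 1, 15, 12, 30, 45, tzinfo=timezone.utc)
example_name = log_filename(created_at, "metadata_created")
```

Because the timestamp is zero-padded and always in UTC, sorting filenames alphabetically also sorts them chronologically.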

Git Architecture

We plan to auto-generate many git repos.

Session Git Repo

This repo should be a blockchain-like append only log, making syncing data as easy as git pull.

Question: what about files like PDFs? It feels right to keep a copy in here, but they would balloon the size of these repos. Maybe yet another submodule for session files?

/
├── README.md                  # Session-specific information
├── bills/                     # Bills in this session
│   ├── sb1234/                # Senate Bill 1234
│   │   ├── logs/              # Event logs folder
│   │   │   ├── 20240115T123045Z_session_bill_created.json
│   │   │   ├── 20240115T123045Z_metadata_created.json
│   │   │   ├── 20240117T143022Z_metadata_updated.json
│   │   │   └── ...
│   │   └── files/             # Raw file storage
│   │       ├── bill_introduced.pdf
│   │       ├── bill_amended.pdf
│   │       └── fiscal_note.pdf
│   ├── hb0789/                # House Bill 789
│   │   ├── logs/
│   │   │   └── ...
│   │   └── files/
│   │       └── ...
│   └── ...
└── events/                    # Events for this session
    ├── 2024-04-15-senate-appropriations-hearing.json
    ├── 2024-02-22-house-floor-session.json
    └── ...

Locale Git Repo

Overall locale repo (also generated). Contains links to Git submodules that hold event logs for different sessions/events. Will also contain scripts to rebuild data into Open Civic Data formats.

ocd-blockchain-illinois/
├── .gitmodules
├── README.md
├── scripts/
│   ├── scrape.py # Shortcut to directly scrape for this locale
│   └── rebuild.py # To rebuild OCD data from blockchain logs
├── sessions/
│   ├── ocd-blockchain-illinois/ocd-session/country:us/state:il/2023-2024/
│   ├── ocd-blockchain-illinois/ocd-session/country:us/state:il/2021-2022/
│   └── ocd-blockchain-illinois/ocd-session/country:us/state:il/2019-2020/
└── events/
   ├── 2022-2026/
   ├── 2018-2022/
   └── 2014-2018/
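
A rebuild script would need to read a bill's logs back in order. Since the proposed filenames start with a sortable UTC timestamp, one plausible first step is simply to sort by filename. The sketch below shows that step only; it is an assumption about how rebuild.py might begin, not its actual contents.

```python
import json
from pathlib import Path
from typing import List, Tuple


def replay_logs(logs_dir: Path) -> List[Tuple[str, str, dict]]:
    """Return (timestamp, event_type, payload) tuples in chronological order.

    Assumes filenames like 20240115T123045Z_metadata_created.json, where
    the timestamp prefix sorts lexicographically in time order.
    """
    entries = []
    for path in sorted(logs_dir.glob("*.json"), key=lambda p: p.name):
        timestamp, _, event_type = path.stem.partition("_")
        with path.open(encoding="utf-8") as handle:
            entries.append((timestamp, event_type, json.load(handle)))
    return entries
```

A full rebuild would then fold these events into Open Civic Data objects, which is outside the scope of this sketch.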

Main Repo

The primary repo (also generated) that people can clone to get all civic data easily via the submodules.

open-civic-data-blockchain/
├── .gitmodules
├── README.md
├── scripts/
│   ├── update_all.sh
│   ├── integrity_check.py
│   └── generate_cross_jurisdictional_report.py
└── jurisdictions/
    ├── country:us/
    │   ├── state:il/                           # Illinois submodule
    │   ├── state:ca/                           # California submodule
    │   ├── state:ny/                           # New York submodule
    │   ├── district:dc/                        # Washington DC submodule
    │   ├── county:us/state:va/fairfax/         # Fairfax County submodule
    │   └── place:us/state:tx/austin/           # City of Austin submodule
    ├── country:ca/
    │   ├── province:on/                        # Ontario province submodule
    │   └── province:bc/                        # British Columbia submodule
    └── country:uk/
        ├── england/                            # England submodule
        └── scotland/                           # Scotland submodule

TODO List

Timestamps: Scrape-Oriented vs. Gov-Oriented
Are log timestamps the time we scraped the data, or the time of the actual government update?
What if a specific event doesn’t have a timestamp?
➤ Open Civic Data also discussed this
Unique IDs
OpenStates uses a lot of generated UUIDs. Ideally, our folder/file structure and naming conventions should follow official legislative data.
- Jurisdiction ID: Follows OCD naming convention — country:us/state:fl/government
- Session ID: TODO
- Bill ID: jurisdiction_id/sessions/:session_id/bill.identifier — use official ID like HB250
- Vote Event ID: TODO
- Person ID: TODO
- Event ID: TODO
Bill Folder + Filename Convention
- bill.metadata: bill_id/logs/metadata_update_{TODO}.json
- bill.actions: bill_id/logs/action_{TODO}.json
- bill.votes: bill_id/logs/vote_{TODO}.json
- bill.sponsors: bill_id/logs/sponsor_update_{TODO}.json
- bill.versions:
  - File: bill_id/files/version_{TODO}.pdf
  - Log: bill_id/logs/version_add_{TODO}.json (we can extract PDF content to JSON)
- bill.documents:
  - File: bill_id/files/documents_{TODO}.pdf
  - Log: bill_id/logs/document_add_{TODO}.json (we can extract PDF content to JSON)
Event Folder Convention
Events tied to sessions should live inside the session folder.
Out-of-session events: can we define a reliable alternate time span for organization?
How to Handle Metadata Changes
Metadata (such as bill fields) may change from scrape to scrape.
Use fieldMask for lightweight updates, or consider JSON Patch.
➤ https://jsonpatch.com
```
// bill.metadata_events
{
  "fieldMask": ["from_organization"],
  "bill": {
    "from_organization": ""
  }
}
```
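
One way to read the fieldMask example above: the mask names the top-level fields that changed, and only those fields are copied from the event into the current metadata. The Python sketch below applies an update under that assumption; nested field paths and JSON Patch are not handled.

```python
from copy import deepcopy


def apply_field_mask(current: dict, event: dict) -> dict:
    """Return a copy of `current` with only the masked fields replaced.

    Assumes each name in event["fieldMask"] is a top-level key of
    event["bill"], as in the example event above.
    """
    updated = deepcopy(current)
    for field_name in event["fieldMask"]:
        updated[field_name] = deepcopy(event["bill"][field_name])
    return updated


metadata = {"title": "An Act", "from_organization": "lower"}
update_event = {
    "fieldMask": ["from_organization"],
    "bill": {"from_organization": "upper"},
}
new_metadata = apply_field_mask(metadata, update_event)
```

Returning a new dict rather than mutating the input keeps each replay step easy to test and compare.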

Environment Setup

For now, we aren’t doing any coding that touches the previous code. All code/decisions should be in this scraper_next folder as an isolated experiment. If you don’t have git access, message @sartaj.

Easy: Download Data and Explore With SQL Explorers

OpenStates Illinois Scraper Output Files
State/Federal OpenStates Data Explorer
- password is ChiHackNight closing group phrase all lowercase
Chicago OCD Data Explorer: explore the Councilmatic PG dump for Chicago OCD data

Advanced: Running Scrapers / Importing PG Dumps

Open States
- via Scraper. We are using this for v1. Running the scrapers directly keeps the data much more up to date, since it comes straight from the source. It also allows us to run certain scrapers, like USA, multiple times a day.
- via SQL Dump, which updates every few days, and has bill full text, in addition to a lot of other content like maps data.
Chicago SQL Dump. This updates every night and is managed by Datamade, who we have already been collaborating with on Chicago data. They also do stuff like AI summaries that we can pre-pull.

Prior Art

Washington DC made GitHub their official law source of truth. It looks immutable.
How append-only logs are used in p2p/blockchain applications.
Beginners guide to event sourced databases and their benefits.
Bluesky LGBTQ+ Legislation Alerts: This incredible team has manually created a system that I think we can build tooling for, and that they would potentially want to use.

Communications

Discussion via Slack
Task Board via Slack
(this file) Collaborative Brainstorming via Git: Feel free to edit.

Bill Bot Designer

Overview

The Bill Bot Designer is a tool for creating automated bots that monitor and publish legislative updates from the Open Civic Data Blockchain. These bots can be configured to watch specific bills, jurisdictions, or events and automatically post updates to various platforms like Bluesky, Twitter, RSS feeds, or custom webhooks.

Why Bots?

🤖 Automated Monitoring: Bots can continuously watch for legislative changes without human intervention
📢 Multi-Platform Publishing: Single bot configuration can publish to multiple platforms simultaneously
🎯 Targeted Alerts: Organizations can create highly specific feeds for their constituents
⚡ Real-Time Updates: Instant notifications when important legislative events occur
🔄 Consistent Formatting: Standardized message formats across all platforms

Bot Architecture

Event-Driven Design

Bots operate on an event-driven architecture, listening to the append-only log of legislative events:

Legislative Event → Blockchain Log → Bot Filter → Message Generation → Platform Publishing

Bot Components

Event Listener: Monitors the blockchain log for new events
Filter Engine: Applies rules to determine if an event should trigger the bot
Message Generator: Creates platform-specific messages from event data
Publisher: Sends messages to configured platforms
Rate Limiter: Ensures compliance with platform API limits
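
The flow from filter to message to publisher can be sketched in a few lines. The Python below is a simplified illustration under assumed event shapes (the "jurisdiction", "type", and "bill" keys are hypothetical); it is not the Bill Bot Designer's actual code, and it leaves out the listener and rate limiter.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List


def matches(event: dict, jurisdiction: str, event_types: List[str]) -> bool:
    """Filter Engine: keep events for one jurisdiction and a set of types."""
    return event["jurisdiction"] == jurisdiction and event["type"] in event_types


def render(template: str, event: dict) -> str:
    """Message Generator: fill placeholders such as {bill.identifier}."""
    # SimpleNamespace lets str.format resolve dotted attribute lookups.
    fields = {key: SimpleNamespace(**value)
              for key, value in event.items() if isinstance(value, dict)}
    return template.format(**fields)


def run_bot(events: Iterable[dict], jurisdiction: str, event_types: List[str],
            template: str, publish: Callable[[str], None]) -> int:
    """Run matching events through the template and hand them to a publisher."""
    published = 0
    for event in events:
        if matches(event, jurisdiction, event_types):
            publish(render(template, event))
            published += 1
    return published


sample_events = [
    {"jurisdiction": "country:us/state:il", "type": "bill_introduced",
     "bill": {"identifier": "SB1234", "title": "Clean Water Act"}},
    {"jurisdiction": "country:us/state:ca", "type": "bill_introduced",
     "bill": {"identifier": "AB1", "title": "Other"}},
]
outbox: List[str] = []
sent = run_bot(sample_events, "country:us/state:il", ["bill_introduced"],
               "{bill.identifier}: {bill.title}", outbox.append)
```

Passing the publisher in as a function keeps platform-specific code (Bluesky, RSS, webhooks) out of the filtering and templating logic.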

Configuration Examples

Basic Bill Monitor Bot

name: "Illinois Bill Monitor"
description: "Monitors all Illinois bills for key actions"

# Event filtering
filters:
  - jurisdiction: "country:us/state:il"
  - event_types: ["bill_introduced", "bill_passed", "bill_vetoed"]
  - keywords: ["environment", "education", "healthcare"]

# Message template
message_template: |
  📋 {bill.identifier}: {bill.title}
  🏛️ {action.description}
  📅 {action.date}
  🔗 {bill.url}

# Publishing platforms
platforms:
  - type: "bluesky"
    account: "@legislative-alerts.bsky.social"
    rate_limit: "10/hour"
  
  - type: "rss"
    feed_url: "https://example.com/il-bills.xml"
    update_frequency: "immediate"

Specialized Committee Bot

name: "Senate Appropriations Monitor"
description: "Tracks all bills going through Senate Appropriations"

filters:
  - jurisdiction: "country:us/state:il"
  - committee: "Senate Appropriations"
  - event_types: ["bill_referred", "bill_hearing_scheduled", "bill_vote"]

message_template: |
  💰 Senate Appropriations Update
  📋 {bill.identifier}: {bill.title}
  📊 Fiscal Impact: {bill.fiscal_note.summary}
  📅 Next Action: {next_action.description}
  🗓️ Date: {next_action.date}

platforms:
  - type: "webhook"
    url: "https://api.example.com/appropriations-webhook"
    headers:
      Authorization: "Bearer {webhook_token}"
  
  - type: "email"
    recipients: ["budget@example.org", "finance@example.org"]
    subject: "Senate Appropriations Alert: {bill.identifier}"

Constituent Alert Bot

name: "District 5 Constituent Alerts"
description: "Alerts constituents about bills affecting their district"

filters:
  - jurisdiction: "country:us/state:il"
  - sponsor_district: "5"
  - event_types: ["bill_introduced", "bill_passed", "bill_signed"]

message_template: |
  🏠 District 5 Update
  📋 {bill.identifier}: {bill.title}
  👤 Sponsored by: {sponsor.name}
  📝 Summary: {bill.summary}
  📅 Status: {bill.status}
  🔗 Learn more: {bill.url}

platforms:
  - type: "sms"
    phone_numbers: ["+15551234567", "+15559876543"]
    provider: "twilio"
  
  - type: "slack"
    channel: "#district-5-alerts"
    workspace: "example-org"
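
The `filters` blocks in the three configurations above could be evaluated along the following lines. This sketch assumes that all filters must match (logical AND), that any one keyword is enough (logical OR), and that keywords are matched against the bill title; the event field names and both helper functions are hypothetical.

```python
def merge_filters(filter_list: list[dict]) -> dict:
    """Collapse the YAML list of single-key maps into one dictionary."""
    merged: dict = {}
    for item in filter_list:
        merged.update(item)
    return merged


def matches(event: dict, filters: dict) -> bool:
    """Return True only if the event satisfies every configured filter."""
    # Simple equality filters.
    for key in ("jurisdiction", "committee", "sponsor_district"):
        if key in filters and event.get(key) != filters[key]:
            return False
    # The event type must be one of the listed types.
    if "event_types" in filters and event.get("event_type") not in filters["event_types"]:
        return False
    # At least one keyword must appear in the title (case-insensitive).
    keywords = filters.get("keywords", [])
    if keywords:
        title = event.get("title", "").lower()
        if not any(word.lower() in title for word in keywords):
            return False
    return True


filters = merge_filters([
    {"jurisdiction": "country:us/state:il"},
    {"event_types": ["bill_introduced", "bill_passed", "bill_vetoed"]},
    {"keywords": ["environment", "education", "healthcare"]},
])
event = {
    "jurisdiction": "country:us/state:il",
    "event_type": "bill_passed",
    "title": "Education Funding Act",
}
```

With these inputs `matches(event, filters)` is true; changing the event type to `bill_referred` or the title to one without a listed keyword makes it false.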

Platform Integrations

Bluesky

Rate Limit: 10 posts per hour
Character Limit: 300 characters
Features: Rich text, links, images
Authentication: App password required

Twitter/X

Rate Limit: 300 tweets per 3 hours
Character Limit: 280 characters
Features: Text, images, polls
Authentication: OAuth 2.0

RSS Feeds

Format: RSS 2.0 or Atom
Update Frequency: Configurable
Features: Full text, categories, enclosures
Hosting: GitHub Pages, custom server

Webhooks

Method: POST
Content-Type: application/json
Authentication: Bearer token or API key
Retry Logic: Exponential backoff
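
Exponential backoff means waiting twice as long after each failed delivery attempt. The following sketch shows the idea with a generic `send` callable standing in for the actual HTTP POST; the function names, the default of four retries, and the 60-second cap are assumptions made for this example.

```python
import time


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delays of base, 2*base, 4*base, ... seconds, each capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def send_with_retry(send, payload, retries: int = 4, base: float = 1.0, sleep=time.sleep):
    """Call send(payload); after a failure, wait and try again.

    Re-raises the last exception once all retries are used up.
    """
    delays = backoff_delays(retries, base)
    for attempt in range(retries + 1):
        try:
            return send(payload)
        except Exception:
            if attempt == retries:
                raise
            sleep(delays[attempt])
```

Production code would usually catch only network-related exceptions and add random jitter to the delays so that many bots do not retry in lockstep.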

Email

Providers: SMTP, SendGrid, Mailgun
Templates: HTML and plain text
Attachments: PDF bills, documents
Rate Limits: Varies by provider

SMS

Providers: Twilio, AWS SNS
Character Limit: 160 characters
Features: Text only
Cost: Per message
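
Because the character limits above differ by platform, a message generator needs to shorten text per destination. Here is a minimal sketch using the limits listed in this section; the `fit` helper is hypothetical, and it counts Python code points, which is only an approximation (Bluesky counts graphemes, X weights some characters differently, and SMS limits depend on the encoding).

```python
# Character limits as listed in the platform sections above.
LIMITS = {"bluesky": 300, "twitter": 280, "sms": 160}


def fit(message: str, platform: str, ellipsis: str = "…") -> str:
    """Shorten a message to the platform limit, marking any cut with an ellipsis."""
    limit = LIMITS[platform]
    if len(message) <= limit:
        return message
    # Leave room for the ellipsis and avoid ending on whitespace.
    return message[: limit - len(ellipsis)].rstrip() + ellipsis
```

Links are best placed early in the template or appended after truncation, since a cut-off URL is useless to the reader.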

Advanced Features

Conditional Logic

filters:
  - jurisdiction: "country:us/state:il"
  - conditions:
      - if: "bill.fiscal_impact > 1000000"
        then: "priority = high"
      - if: "bill.sponsor.party == 'Republican'"
        then: "include_opposition_analysis = true"
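
One way to evaluate `if`/`then` rules like these without resorting to `eval` is a tiny parser for expressions of the form `path operator literal`. The sketch below assumes exactly that three-part shape, separated by single spaces, and treats every `then` value as a plain string; all function names are hypothetical.

```python
import ast
import operator

OPS = {
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}


def lookup(context: dict, path: str):
    """Resolve a dotted path such as "bill.sponsor.party" in nested dicts."""
    value = context
    for part in path.split("."):
        value = value[part]
    return value


def holds(expr: str, context: dict) -> bool:
    """Evaluate "<dotted.path> <operator> <literal>" against the context."""
    path, op, literal = expr.split(" ", 2)
    return OPS[op](lookup(context, path), ast.literal_eval(literal))


def apply_rules(rules: list[dict], context: dict) -> dict:
    """Collect the key/value assignments of every rule whose condition holds."""
    flags = {}
    for rule in rules:
        if holds(rule["if"], context):
            key, value = (part.strip() for part in rule["then"].split("=", 1))
            flags[key] = value
    return flags


rules = [
    {"if": "bill.fiscal_impact > 1000000", "then": "priority = high"},
    {"if": "bill.sponsor.party == 'Republican'", "then": "include_opposition_analysis = true"},
]
context = {"bill": {"fiscal_impact": 2500000, "sponsor": {"party": "Democratic"}}}
flags = apply_rules(rules, context)
```

For this sample bill only the first rule fires, so `flags` contains just the `priority` entry.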

Message Templates with Variables

message_template: |
  {#if bill.fiscal_impact > 1000000}💰 HIGH COST BILL {/if}
  📋 {bill.identifier}: {bill.title}
  👤 Sponsor: {bill.sponsors[0].name} ({bill.sponsors[0].party})
  📊 Fiscal Impact: ${bill.fiscal_impact:,.0f}
  📅 {action.date | date_format: "%B %d, %Y"}
  🔗 {bill.url}
  
  {#if bill.summary}
  📝 {bill.summary | truncate: 200}
  {/if}
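
The `truncate` and `date_format` filters and the `:,.0f` number format used in this template could be backed by small helper functions. The Python versions below are an illustrative guess at their behavior, assuming ISO-formatted date strings as input; they are not taken from the Govbot code.

```python
from datetime import date


def truncate(text: str, length: int) -> str:
    """Cut text to at most `length` characters, marking the cut with an ellipsis."""
    if len(text) <= length:
        return text
    return text[: length - 1].rstrip() + "…"


def date_format(value: str, fmt: str) -> str:
    """Reformat an ISO date string (YYYY-MM-DD) using strftime codes."""
    return date.fromisoformat(value).strftime(fmt)


def money(amount: float) -> str:
    """Format a dollar amount with thousands separators and no cents."""
    return f"${amount:,.0f}"


summary = truncate("A" * 300, 200)
when = date_format("2025-03-01", "%B %d, %Y")
cost = money(2500000)
```

Under the default locale, `when` reads "March 01, 2025" and `cost` reads "$2,500,000".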

Scheduled Publishing

publishing:
  schedule:
    - time: "09:00"
      timezone: "America/Chicago"
      days: ["monday", "tuesday", "wednesday", "thursday", "friday"]
    - time: "17:00"
      timezone: "America/Chicago"
      days: ["monday", "tuesday", "wednesday", "thursday", "friday"]
  
  batch_size: 5
  delay_between_posts: "30s"
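
A scheduler for this configuration has two jobs: decide whether a slot is due, and release at most `batch_size` queued posts. The sketch below assumes the caller has already converted the current time to the slot's timezone (for example with the standard `zoneinfo` module); `is_due` and `next_batch` are hypothetical helpers.

```python
from datetime import datetime

DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def is_due(local_now: datetime, slot: dict) -> bool:
    """True if local_now falls on a scheduled day at the slot's HH:MM.

    local_now must already be expressed in the slot's timezone.
    """
    hour, minute = (int(part) for part in slot["time"].split(":"))
    on_day = DAYS[local_now.weekday()] in slot["days"]
    return on_day and (local_now.hour, local_now.minute) == (hour, minute)


def next_batch(queue: list, batch_size: int) -> tuple[list, list]:
    """Split the queue into the posts to send now and the posts left waiting."""
    return queue[:batch_size], queue[batch_size:]


slot = {
    "time": "09:00",
    "timezone": "America/Chicago",
    "days": ["monday", "tuesday", "wednesday", "thursday", "friday"],
}
monday_morning = datetime(2025, 3, 3, 9, 0)    # a Monday
saturday_morning = datetime(2025, 3, 8, 9, 0)  # a Saturday
to_send, waiting = next_batch(list(range(12)), batch_size=5)
```

Posts in `to_send` would then go out one at a time, pausing for `delay_between_posts` between them.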

Analytics and Monitoring

analytics:
  track_engagement: true
  platforms:
    - bluesky
    - twitter
    - webhook
  
  metrics:
    - posts_sent
    - engagement_rate
    - error_rate
    - response_time
  
  alerts:
    - condition: "error_rate > 0.05"
      action: "email_admin"
    - condition: "no_posts_24h"
      action: "slack_alert"
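
The two alert conditions above can be checked with a few lines of arithmetic. This sketch assumes that `error_rate` means failed posts divided by all attempts and that `no_posts_24h` means zero successful posts in the last 24 hours; the metric key names are hypothetical.

```python
def error_rate(posts_sent: int, errors: int) -> float:
    """Share of publishing attempts that failed (0.0 when nothing was attempted)."""
    attempts = posts_sent + errors
    return errors / attempts if attempts else 0.0


def triggered_alerts(metrics: dict) -> list[str]:
    """Return the alert actions whose conditions hold for these metrics."""
    actions = []
    if error_rate(metrics["posts_sent"], metrics["errors"]) > 0.05:
        actions.append("email_admin")
    if metrics["posts_last_24h"] == 0:
        actions.append("slack_alert")
    return actions


busy_but_failing = {"posts_sent": 90, "errors": 10, "posts_last_24h": 90}
silent = {"posts_sent": 0, "errors": 0, "posts_last_24h": 0}
```

Here `busy_but_failing` has a 10% error rate and triggers `email_admin`, while `silent` triggers only `slack_alert`.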

Best Practices

Content Guidelines

Be Accurate: Always verify data before publishing
Stay Neutral: Present information without bias
Include Context: Provide background information when relevant
Use Clear Language: Avoid jargon and technical terms
Include Sources: Always link to official sources

Technical Guidelines

Rate Limiting: Respect platform API limits
Error Handling: Implement retry logic and fallbacks
Monitoring: Track bot performance and errors
Testing: Test configurations before going live
Documentation: Document bot purposes and configurations

Legal Considerations

Copyright: Respect copyright on bill text and documents
Attribution: Always credit original sources
Disclaimers: Include appropriate disclaimers
Compliance: Follow platform terms of service
Privacy: Don’t collect or store personal information

Getting Started

1. Choose Your Use Case

General Monitoring: Track all bills in a jurisdiction
Committee Focus: Monitor specific committees
Issue-Based: Track bills by topic or keywords
Constituent Service: Alert constituents about relevant bills

2. Design Your Filters

Jurisdiction: Which government body to monitor
Event Types: What actions to track
Keywords: Specific topics or terms
Sponsors: Bills from specific legislators

3. Create Your Message Template

Platform Limits: Consider character limits
Required Information: Bill ID, title, action, date
Optional Details: Sponsor, summary, fiscal impact
Call to Action: Links to learn more or take action

4. Configure Platforms

Primary Platform: Choose your main publishing platform
Secondary Platforms: Add additional platforms for reach
Testing: Test with a small audience first
Monitoring: Set up alerts and analytics

5. Deploy and Monitor

Gradual Rollout: Start with limited scope
Monitor Performance: Track engagement and errors
Iterate: Refine based on feedback and data
Scale: Expand to additional jurisdictions or topics

Examples in Action

Bluesky Legislative Alerts

The Bluesky LGBTQ+ Legislation Alerts bot demonstrates how effective automated legislative monitoring can be. It:

Monitors bills across multiple states
Filters for LGBTQ+ related legislation
Posts concise, informative updates
Builds a community around legislative transparency

Chicago Councilmatic

The Chicago Councilmatic system shows how bots can enhance existing civic data platforms:

Integrates with existing Open Civic Data sources
Provides real-time updates on city council activities
Maintains historical records of all legislative actions
Enables custom feeds for different stakeholders

Future Enhancements

AI-Powered Features

Smart Summaries: AI-generated bill summaries
Impact Analysis: Automated analysis of bill effects
Sentiment Analysis: Track public opinion on bills
Predictive Modeling: Forecast bill outcomes

Advanced Integrations

Calendar Integration: Add events to personal calendars
CRM Integration: Track constituent interactions
Newsletter Integration: Compile weekly summaries
API Access: Allow third-party integrations

Enhanced Analytics

Engagement Tracking: Measure bot effectiveness
A/B Testing: Test different message formats
Audience Insights: Understand who’s following bots
Performance Optimization: Improve delivery rates

Windy Civi

A unified portal for Chicago residents that shows local, state, and federal bills with AI-generated summaries and topics, and lets users sign up for notifications.

Charles Cole
Sue Kwong

History

How Did We Get Here

Civi Social Site

Defined Vision

Socratic.Center

Easily find who represents you.

In this era, the team built www.socratic.center, a site for easily finding your representative. The project then merged into Chi Hack Night as a breakout group.

Contributors

Sartaj Chowdhury

History

Find your rep

Why Open Civic Data as the Base Schema?