
Mission
Why don’t we pay attention to our representatives between elections?
Legislative data is hard to parse, track, and organize. Activists, concerned citizens, and the curious may not have the time, resources, or expertise to build out duplicative tech stacks. Existing solutions may be limited by the willingness of organizations and companies to continue to run and host them - such as in the case of Google’s Civic Information API, which was shut down earlier this year. What would a decentralized, open-source legislative data solution look like?
The Govbot team’s goal is to bridge this gap by building the framework for creating and using federated, open-source, non-profit legislative data. Built as a Chi Hack Night Breakout Group, the project includes an open-source, simplified, and expanded version of OpenStates’ data on state and federal legislation, as well as example applications.
What We Offer
The main Govbot dataset currently includes legislative updates from bills in the U.S. House & Senate, all 50 states, territories like Guam, and the city of Chicago, as .json files organized using the Project Open Data catalog format. The Govbot scrapers update regularly, appending new logs and then running them through Claude to provide topic-based tagging and summaries. This data can then be analyzed using SQL, via an interface built with DuckDB, or plugged into applications like our example website, WindyCivi, and a test BlueSky bot built in collaboration with Illinois State Representative Hoan Huynh (https://bsky.app/profile/test-hoan-huynh.bsky.social).
How Do I Use It?
1. Install
sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"
2. Run govbot
govbot
That’s it. If no govbot.yml exists, an interactive wizard walks you through setup:
- Sources - Choose all 47 states or pick specific ones
- Tags - Start with an example tag, or get an AI prompt you can copy-paste to create your own
- Publishing - RSS feeds configured automatically
The wizard creates govbot.yml, .gitignore, and a GitHub Actions workflow.
3. Run the pipeline
Once set up, running govbot again executes the full pipeline:
- Clones/updates legislation repositories
- Tags bills based on your tag definitions
- Generates RSS feeds in the docs/ directory
Other Commands
govbot clone all # download all state legislation datasets
govbot clone il ca ny # download specific states
govbot logs # stream legislative activity as JSON Lines
govbot logs | govbot tag # process and tag data
govbot build # generate RSS feeds
govbot load # load bill metadata into DuckDB database
govbot delete all # remove all downloaded data
govbot update # update govbot to latest version
govbot --help # see all commands and options
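As a rough illustration of consuming the JSON Lines stream that govbot logs emits, the sketch below parses one JSON object per line and skips blank or malformed lines. The function name and the sample records are hypothetical; the actual fields of a govbot log record are not described here, so nothing in the sketch depends on them.

```python
import json
from typing import Iterable, Iterator


def parse_json_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per JSON Lines record, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a truncated or partial line in the stream
        if isinstance(record, dict):
            yield record


# Hypothetical sample input; real records would come from `govbot logs`.
sample_lines = ['{"id": 1}', "", "not json", '{"id": 2}']
records = list(parse_json_lines(sample_lines))
```

In a script, the same function could be fed sys.stdin so that it sits on the receiving end of a pipe from govbot logs.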
Dataset Key:
- 🆕: the locale’s data received updates since your last cloning
- ✅: the data you’ve cloned is up-to-date with the most current version
- 🔄: the data is currently being updated
- ❌: the data is not currently accessible
Querying in SQL using DuckDB
You can query the data using SQL, via DuckDB, which creates a simulated database from the .json log files. See DUCKDB.md for more details.
Running Queries in the Command Line
-- Load JSON extension
INSTALL json;
LOAD json;
-- Query all bill metadata
SELECT *
FROM read_json_auto('~/.govbot/repos/**/bills/*/metadata.json')
LIMIT 10;
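For readers who want the same data without SQL, here is a minimal sketch that walks the same file pattern with the Python standard library. It assumes the repos live under ~/.govbot/repos as in the query above; the function name is hypothetical, and the contents of each metadata.json are passed through untouched because the schema is not shown here.

```python
import json
from pathlib import Path

# Default location taken from the SQL query above.
DEFAULT_ROOT = Path.home() / ".govbot" / "repos"


def load_bill_metadata(repos_root: Path, limit: int = 10) -> list[dict]:
    """Read up to `limit` metadata.json files found under `repos_root`."""
    records = []
    # Mirrors the glob used in the SQL query: **/bills/*/metadata.json
    for path in sorted(repos_root.glob("**/bills/*/metadata.json")):
        if len(records) >= limit:
            break
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records
```

Calling load_bill_metadata(DEFAULT_ROOT) returns a list of parsed metadata objects, roughly the counterpart of the LIMIT 10 query.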
Additional Commands, and Querying via the Web UI
Additional examples of commands, and setup for the web UI, can be found below:
# Load all data into a database (default: govbot.duckdb)
govbot load
# Or specify a custom database file
govbot load --database my-bills.duckdb
# With memory limit and thread settings
govbot load --memory-limit 32GB --threads 8
# Open in DuckDB UI (opens in your browser)
duckdb --ui govbot.duckdb
Helper Scripts
# Run example queries
./duckdb-query.sh examples/duckdb-example.sql
Contributing & Testing
Prerequisites
Folks looking to contribute should have knowledge of Rust and the just task runner. Run just setup to start, and then use just govbot ... to develop the CLI.
The following should also be installed:
- Rust & Cargo: Install the Rust Toolchain
- Just: Install the task runner:
cargo install just
Development Workflow
Use just govbot ... as your cli “dev” environment.
Other Useful Commands
- just - See all available tasks
- just test - Run all tests
- just review - Review snapshot test changes
- just mocks [LOCALES...] - Update mock data for testing
We build snapshots off examples. Add examples to make a test.
Advanced
GOVBOT_REPO_URL_TEMPLATE="https://gitsite.com/org/{locale}.git" govbot ...
Project History
The Govbot project began in 2022, with a vision to create a destination for simplified, summarized updates on legislative action, with the ability to follow or filter for certain legislative topics. The result was the initial Windy Civi app and website, launched in beta in 2024.
While building the solution, the team began to consider the limitations of a centrally-managed data source and platform, versus one that could be decentralized, that was open-source, and that allowed for exploration and use of the data in ways beyond initial designs.
Our vision has now pivoted to building that data set, as well as building sample applications and solutions to ensure that government accountability can be accessible to all.
FAQs
Can I See The Repo?
Yes! Our main repo can be found here. The repo that is being used to run and store the data - the ‘toolkit’ repo - can be found here.
How Is The Data Structured?
You can find the file format structure and .json schema in the readme.md located here.
How Do I Clone This Data?
Each locale is scraped using a GitHub Actions template that is defined and explained in detail here. You can follow this template to create a new repository of locale data.
To help manage multiple pipelines or locales, look at our pipeline manager documentation
How Can I Stay Updated, Or Get In Touch?
You can stay updated by following our work at Chi Hack Night, as well as on the related Slack (see below). You can also follow our commits and updates on GitHub and this Docs page.
You can message us on the Chi Hack Night Slack - we have our own channel.
Let The People Take Back Their Government
Our 2024 efforts taught us a few things:
- People don’t want to download another app or use another portal.
- Getting bill data is really hard, and usually involves using private APIs.
- Creating AI summaries and topics should be controlled by organizations/individuals.
As such, our effort has become 2-fold
Contributors
- Austin McLaughlin
- Brandon
- Daniel Cappy
- Edwin Cuevas
- Jeff Leverenz
- Sartaj Chowdhury
- S.Murakami
- Tamara Dowis
2025 Deck – Democratic Infrastructure

Govbot
Federated, open-source legislative data for everyone
Overview
- The Problem
- Our Solution
- What We Offer
- Features
- Setup + Core Functions
The Problem
Why don’t we pay attention to our representatives between elections?
Legislative data is hard to parse, track, and organize. Activists, concerned citizens, and the curious may not have the time, resources, or expertise to build out duplicative tech stacks.
The Problem (cont.)
Existing solutions may be limited by the willingness of organizations and companies to continue to run and host them - such as in the case of Google’s Civic Information API, which was shut down earlier this year.
What would a decentralized, open-source legislative data solution look like?
Our Solution
The Govbot team’s goal is to bridge this gap - building the framework for federated, open-source, non-profit legislative data.
Built as a Chi Hack Night Breakout Group, this project offers frameworks and tools built on top of OpenStates’ data on state and federal legislation.
What We Offer
The main Govbot dataset currently includes legislative updates from:
- the U.S. House & Senate
- Legislatures from all 50 states
- Legislatures from U.S. territories
Data is organized as .json files using the Project Open Data catalog format, scraped and appended regularly.
Features
- A decentralized, regularly updating, legislative data catalog
- AI-powered, topic-based tagging and summaries, customized using .yml
- SQL querying via DuckDB interface
- Example applications, like custom websites (see our demo WindyCivi site), and social media bots (see our BlueSky bot, made in collaboration with Illinois State Representative Hoan Huynh)
Setup
You can download the setup script via one-line install, from our GitHub repository:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"
Core Functions
Once installed, you can:
- Clone the entire dataset
- Clone specific items (state, session, or bill)
- Load metadata into a SQL-accessible DuckDB database
Project History
2022: socratic.center
The initial hypothesis: *What if citizens could easily track and understand the bills being voted on?*
civi.social
This experiment helped us understand how citizens wanted to engage with civic data in their existing communities.
myChicago + Jarvis
The goal of Jarvis, our AI-powered assistant, would have been to help users understand legislation through:
- Simplified bill summaries
- Contextual information
- Guided engagement tools
Windy Civi: Full Launch
The goal was to enable citizens to:
- Track bills by topic
- Receive personalized updates
- Connect directly with representatives
Rethinking Our Approach
While building these solutions, we began to ask a critical question:
What are the limitations of a centrally-managed platform?
- Can it scale to serve all communities?
- What happens if we stop maintaining it?
- How can others build on this work?
Our New Vision
Our vision has now pivoted to building the infrastructure itself:
- A decentralized legislative data catalog
- Reusable frameworks for communities to build their own tools
- Sample applications demonstrating use cases
Our goal: Ensure that government accountability is accessible to all.
Live Demos
- Basic Setup + Commands
- Querying via DuckDB
- Creating Social Media Bots
Basic Setup + Commands
Install via:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/chihacknight/govbot/main/actions/govbot/scripts/install-nightly.sh)"
Once installed, you can download and set up the data using the following commands:
govbot # to see help
govbot clone # to show available datasets
govbot clone {{locale}} {{locale}} # download specific items
govbot delete {{locale}} # delete specific items
govbot delete all # delete everything
govbot load # load bill metadata into DuckDB
Querying with DuckDB
First, set up DuckDB, which creates a simulated database from the .json log files:
govbot load #Load all data into a database
govbot load --database my-bills.duckdb # Specify a custom database file
govbot load --memory-limit 32GB --threads 8 # With memory limit and thread settings
duckdb --ui govbot.duckdb # Open in DuckDB UI (opens in browser)
Once the DuckDB database is created, you can query as normal:
-- Load JSON extension
INSTALL json;
LOAD json;
-- Query all bill metadata
SELECT *
FROM read_json_auto('~/.govbot/repos/**/bills/*/metadata.json')
LIMIT 10;
Creating Social Media Bots
Technical Details
Our Open Civic Data Proposal
Democratizing government data
- What does it mean to democratize government data?
- Today: legislation (with room to expand to courts, agencies, and more)
- To understand the solution, it helps to first understand the problem
The problem
Legislative data is commonly distributed through APIs or large database dumps. These approaches work well for transactional access, but they introduce real limitations when the goal is long-term analysis and accountability.
They make it harder to:
- Perform bulk or historical analysis
- Track changes over time
- Analyze data without running a database server
They also introduce fragility:
- APIs change or disappear
- Long-term access and verification become difficult

Why this matters
- Civic trust
- Research
- Accountability
- Anyone can verify, not just institutions
- A shared source of truth without interpretation baked in

What we built (and why Git)
- File-based structure
- Bills, events, logs
- Deterministic paths to find things
- Built on Git for history, distribution, cheap branching, and broad accessibility
- Aligned with Open States data and Open Civic Data (OCD) identifiers
- Formalized through an Open Civic Data proposal
This design treats the filesystem as the primary interface for civic data.
The OCD proposal (why this matters upstream)
- Makes the model reusable beyond Windy Civi
- Provides shared vocabulary and structure
- Enables other projects to adopt or adapt the approach
Technical challenges and triumphs
- Making transformations deterministic so Git diffs remain meaningful
- Interpreting and triaging state-by-state scraper errors
- Passing data cleanly between CI steps (artifacts, environment variables, Docker parity)
- Designing self-contained log entries that remain analyzable outside their folder context
- Building a “last seen” mechanism when upstream sources return full snapshots
- Identifying hard limits: PDF redlines and crossouts remain an open problem
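To make the first challenge concrete: one common way to keep Git diffs meaningful is to serialize every record the same way on every run, so unchanged data produces byte-identical files. The sketch below shows that idea with sorted keys and fixed formatting. It is an illustration of the general technique, not the project's actual serializer, and the function name and sample record are hypothetical.

```python
import json


def to_canonical_json(record: dict) -> str:
    """Serialize with sorted keys and fixed formatting so equal data yields equal text."""
    return json.dumps(record, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


# Two scrapes that return the same fields in a different order...
first_scrape = {"title": "Example Act", "identifier": "HB 1"}
second_scrape = {"identifier": "HB 1", "title": "Example Act"}

# ...serialize to the same text, so Git sees no change.
same_output = to_canonical_json(first_scrape) == to_canonical_json(second_scrape)
```

Without a fixed key order, a scraper that merely reorders fields would show up as a change in every file it touches.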
A Dive Into Local AI Tagging
Use two models for two very different roles
- Smart LLM (ChatGPT / Claude / Cursor)
- Human-in-the-loop
- Used during development
- Produces tag configuration
- Small embedding model
- Fully automated
- Used in production
- Categorizes every update
The smart LLM helps write the rules. The small model runs them.
Step 1: Tag Authoring (Developer Workflow)
A developer sits down with:
- Sample legislative updates
- Court rulings
- Regulatory notices
Using ChatGPT / Claude / Cursor, they prompt:
“Create a tag config for legislative bill introductions. Include examples, negative examples, and keywords.”
The output is reviewed, edited, and committed like code.
Important Clarification
The “smart” LLM is not part of production.
It is used the same way you’d use:
- A code editor
- A linter
- A schema generator
Think of ChatGPT / Claude / Cursor as a tag authoring tool.
What the Smart LLM Actually Does
The smart LLM is used interactively by a developer to:
- Define new tags
- Refine descriptions
- Generate examples and edge cases
- Identify negative examples
- Propose include / exclude keywords
It replaces manual taxonomy writing — not runtime logic.
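To give a feel for what a reviewed tag configuration might drive at runtime, here is a deliberately simplified sketch that applies include / exclude keywords only. This is a stand-in: the production path described above uses a small embedding model, which is not reproduced here, and the TagRule fields and the sample rules are hypothetical rather than govbot's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class TagRule:
    """Hypothetical tag definition; field names are illustrative only."""
    name: str
    include_keywords: list[str] = field(default_factory=list)
    exclude_keywords: list[str] = field(default_factory=list)


def match_tags(text: str, rules: list[TagRule]) -> list[str]:
    """Return names of rules with at least one include keyword and no exclude keyword."""
    lowered = text.lower()
    matched = []
    for rule in rules:
        has_include = any(k.lower() in lowered for k in rule.include_keywords)
        has_exclude = any(k.lower() in lowered for k in rule.exclude_keywords)
        if has_include and not has_exclude:
            matched.append(rule.name)
    return matched


rules = [
    TagRule("education", ["school", "teacher"], ["driving school"]),
    TagRule("environment", ["clean water", "emissions"]),
]
tags = match_tags("Amends the School Code to fund teacher training.", rules)
```

The point of the split stays the same either way: the rules are authored and reviewed by people, and the runtime step applies them mechanically.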

What’s next for the project
- Building relationships with activists and journalists
- Creating + designing customizable tagging templates + a system to share them
- Incorporating Executive Orders, judicial opinions, and other relevant non-legislative documents
- Exploring use cases for the data, such as automated content pipelines
- Adding donation data for analysis of legislative priorities and campaign promises
Special thanks to the following contributors:
Sartaj Chowdhury Tamara Dowis Edwin Chalas Cuevas Andrew Dauphinais Emme Kari Douglass Marissa Heffler Sartaj Chowdhury Zach Schoneman Brian Burns
Thank You!
- Chi Hack Night
- Open States
- Open Civic Data community
Building government accountability tools accessible to all
Appendix
- Contributing & Testing
- FAQs
Contributing & Testing
Prerequisites
Knowledge of Rust and the just task runner required.
- Rust & Cargo: Install the Rust Toolchain
- Just: Install the task runner:
cargo install just
Development Workflow
Use just govbot ... as your CLI “dev” environment.
Useful Commands:
- just - See all available tasks
- just test - Run all tests
- just review - Review snapshot test changes
- just mocks [LOCALES...] - Update mock data for testing
Dataset Status Key
- 🆕 The locale’s data received updates since your last cloning
- ✅ Your data is up-to-date with the most current version
- 🔄 The data is currently being updated
- ❌ The data is not currently accessible
FAQs: Repositories
Can I See The Repo?
- Main repo: windy-civi/windy-civi
- Toolkit repo: chihacknight/govbot
FAQs: Data Structure
How Is The Data Structured?
Find the file format structure and .json schema in the readme.md: DATA_STRUCTURES.md
FAQs: Cloning Data
How Do I Clone This Data?
Each locale is scraped using a GitHub Actions template explained here: README_TEMPLATE.md
To manage multiple pipelines or locales, see our pipeline manager documentation
Stay Connected
How Can I Stay Updated, Or Get In Touch?
- Follow our work at Chi Hack Night
- Check commits and updates on GitHub
- Visit our Docs page
- Join the Chi Hack Night Slack
2025 Bill Blockchain
Open Civic Data Blockchain Proposal
This proposal outlines a decentralized, peer-to-peer system for managing and publishing civic data using a blockchain-like append-only log. Built on the Open Civic Data schema and powered by Git, this architecture enables transparency, tamper-resistance, and flexibility in how public information is stored, shared, and consumed. By treating government data as a series of verifiable, timestamped events, we create an ecosystem where organizations and individuals can build custom civic feeds, automate updates, and uncover hidden dynamics in governance—all without relying on centralized servers.
Why Use a Hashed Append-Only Log?
- 🔐 Truly Peer-to-Peer
  Everyone keeps their own copy of the data, with no central server needed and no extra cost.
- 📜 The Constitution Is Basically a Blockchain
  Government changes through amendments. Our log reflects this: permanent, append-only, and transparent.
- 💻 Highly Tailored Custom Feeds Built With Code + AI
  Composable event logs will be easy to filter, tag, and summarize. Orgs can also combine those feeds to make highly tailored feeds for publishing.
- 🤖 Publish Everywhere with Bots
  Organizations can easily automate updates to any number of platforms, for example with alert bots that post Reddit replies or Bluesky posts, and can layer those bots on top of each other. We can also build tooling for public RSS feeds that news organizations can import.
- ⛓️ Blockchain without the Cringe or Cost
  Blockchain hashes + public key signatures let users verify data themselves without expensive proof algorithms. For IDs, Decentralized Identifiers are the new standard, and they interoperate with Bluesky.
- ☎️ Network Agnostic
  Supports everything: peer-to-peer, pub-sub, polling, WebRTC, email, RSS, push notifications, etc. They will all work naturally.
- 📱 Our App Becomes A Glorified P2P Feed Reader With Civic Tendencies
  By being a P2P feed reader with special features around civic data, we simplify the app itself, and allow others to make their own client apps.
- 🛜 RSS Feeds Just Work
  Feed-based design lets us easily pull in existing sources like Executive Orders or court decisions via RSS, and allows organizations to pull news website feeds.
- ⏪ Bonus: Reveal Power Dynamics
  Replay legislative logs to uncover hidden patterns: who votes when, with whom, and under whose influence.
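The core idea of a hashed append-only log can be sketched in a few lines: each entry stores the hash of the previous entry, so changing any earlier entry breaks every hash after it. The code below is a minimal illustration under assumptions of our own (SHA-256, a JSON payload, and the entry field names prev, payload, and hash are all hypothetical); it leaves out the public key signatures mentioned above.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], payload: dict) -> None:
    """Append a new entry that points at the hash of the current last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)})


def verify_log(log: list[dict]) -> bool:
    """Recompute every hash in order; any edit to an earlier entry is detected."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Git already provides a comparable guarantee through its commit hashes, which is part of why the proposal builds on it rather than on a separate chain.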
Why Open Civic Data as the Base Schema?
- 🤝 Plug Into the Civic Tech Ecosystem
  Uses familiar Open Civic Data formats, making it easy to integrate with existing tools and scrapers.
- 🔄 Reuse Existing Data
  Works with platforms like OpenStates and Councilmatic, giving us access to many data sources.
Why Git for Data Storage?
- 📁 Folders + Files = Maximum Portability
  The most universal data structure: easy to read, edit, and share across tools and platforms.
- 🔄 Git Is Already Peer-to-Peer
  Git is built on a distributed log. git pull works seamlessly in our app and AI workflows.
- 🌐 GitHub = Easy Browsing
  Markdown rendering and file previews make GitHub a friendly UI for exploring without needing to clone. We can also expose RSS feeds via GitHub Pages.
- 🧩 Submodules Keep Repos Lean
  Git submodules let us split large datasets across repos, so no single repo gets bloated.
Folder Structure + Filename Convention
/open-civic-data-blockchain/
├── country:us/  # United States
│   ├── state:il/  # Illinois state
│   │   ├── sessions/  # Legislative sessions
│   │   │   ├── ocd-session/country:us/state:il/2023-2024/  # Full OCD session ID
│   │   │   │   ├── bills/  # Bills in this session
│   │   │   │   │   ├── sb1234/  # Senate Bill 1234
│   │   │   │   │   │   ├── logs/  # Event logs folder
│   │   │   │   │   │   │   ├── 20240115T123045Z_session_bill_created.json  # Initial bill creation in session
│   │   │   │   │   │   │   ├── 20240115T123045Z_metadata_created.json  # Initial metadata creation
│   │   │   │   │   │   │   ├── 20240117T143022Z_metadata_updated.json  # Metadata update with field mask
│   │   │   │   │   │   │   ├── 20240117T143156Z_sponsor_added.json  # Sponsors added
│   │   │   │   │   │   │   ├── 20240120T092133Z_version_added.json  # Version document added
│   │   │   │   │   │   │   ├── 20240130T152247Z_action_added.json  # Action recorded
│   │   │   │   │   │   │   ├── 20240215T103045Z_doc_added.json  # Supporting document added
│   │   │   │   │   │   │   ├── 20240315T140011Z_vote_initiated.json  # Vote started
│   │   │   │   │   │   │   ├── 20240315T143022Z_vote_updated.json  # Vote partial results
│   │   │   │   │   │   │   └── 20240315T150537Z_vote_finalized.json  # Vote complete
│   │   │   │   │   │   └── files/  # Raw file storage
│   │   │   │   │   │       ├── bill_introduced.pdf  # Original version document
│   │   │   │   │   │       ├── bill_amended.pdf  # Amended version document
│   │   │   │   │   │       └── fiscal_note.pdf  # Supporting document
│   │   │   │   │   ├── hb0789/  # House Bill 789
│   │   │   │   │   │   ├── logs/  # Event logs folder
│   │   │   │   │   │   │   ├── 20240118T090023Z_session_bill_created.json  # Initial bill creation in session
│   │   │   │   │   │   │   ├── 20240118T090023Z_metadata_created.json  # Initial metadata creation
│   │   │   │   │   │   │   └── ...
│   │   │   │   │   │   └── files/  # Raw file storage
│   │   │   │   │   │       └── ...
│   │   │   │   │   └── ...
│   │   │   │   └── events/  # Events for this session
│   │   │   │       ├── 2024-04-15-senate-appropriations-hearing.json  # Senate committee hearing
│   │   │   │       ├── 2024-02-22-house-floor-session.json  # House floor session
│   │   │   │       └── ...
│   │   │   ├── ocd-session/country:us/state:il/2021-2022/  # Previous session
│   │   │   │   └── ...
│   │   │   └── ...
│   │   └── events/  # Events not tied to a specific session
│   │       ├── 2024-07-15-joint-commission-meeting.json  # Joint commission meeting
│   │       ├── 2024-08-20-special-task-force.json  # Special task force meeting
│   │       └── ...
│   ├── state:ca/  # California state
│   │   └── ...
│   └── state:ny/  # New York state
│       └── ...
└── country:ca/  # Canada
    └── ...
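The log filenames in the tree above follow a visible pattern: a compact UTC timestamp, an underscore, an event type, and a .json suffix. A small sketch of building and parsing such names follows. The helper names are hypothetical, and whether the timestamp reflects scrape time or government time is still an open question in this proposal, so the sketch takes the timestamp as given.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def log_filename(timestamp: datetime, event_type: str) -> str:
    """Build a name like 20240115T123045Z_metadata_created.json from an aware datetime."""
    stamp = timestamp.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return f"{stamp}_{event_type}.json"


def parse_log_filename(filename: str) -> tuple[datetime, str]:
    """Split a log filename back into its UTC timestamp and event type."""
    stem = filename.removesuffix(".json")
    stamp, _, event_type = stem.partition("_")
    parsed = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return parsed, event_type


created_at = datetime(2024, 1, 15, 12, 30, 45, tzinfo=timezone.utc)
name = log_filename(created_at, "metadata_created")
```

Because the timestamp comes first and has a fixed width, sorting filenames alphabetically also sorts them chronologically.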
Git Architecture
We plan to auto-generate many git repos.
Session Git Repo
This repo should be a blockchain-like append-only log, making syncing data as easy as git pull.
Question: what about files like PDFs? It feels right to keep a copy of them in here, but they would balloon the size of these repos. Maybe yet another submodule for session files?
/
├── README.md  # Session-specific information
├── bills/  # Bills in this session
│   ├── sb1234/  # Senate Bill 1234
│   │   ├── logs/  # Event logs folder
│   │   │   ├── 20240115T123045Z_session_bill_created.json
│   │   │   ├── 20240115T123045Z_metadata_created.json
│   │   │   ├── 20240117T143022Z_metadata_updated.json
│   │   │   └── ...
│   │   └── files/  # Raw file storage
│   │       ├── bill_introduced.pdf
│   │       ├── bill_amended.pdf
│   │       └── fiscal_note.pdf
│   ├── hb0789/  # House Bill 789
│   │   ├── logs/
│   │   │   └── ...
│   │   └── files/
│   │       └── ...
│   └── ...
└── events/  # Events for this session
    ├── 2024-04-15-senate-appropriations-hearing.json
    ├── 2024-02-22-house-floor-session.json
    └── ...
Locale Git Repo
Overall locale repo (also generated). Contains links to Git submodules that have event logs for different sessions/events. Will also contain scripts to rebuild data into Open Civic Data formats.
ocd-blockchain-illinois/
├── .gitmodules
├── README.md
├── scripts/
│   ├── scrape.py  # Shortcut to directly scrape for this locale
│   └── rebuild.py  # To rebuild OCD data from blockchain logs
├── sessions/
│   ├── ocd-blockchain-illinois/ocd-session/country:us/state:il/2023-2024/
│   ├── ocd-blockchain-illinois/ocd-session/country:us/state:il/2021-2022/
│   └── ocd-blockchain-illinois/ocd-session/country:us/state:il/2019-2020/
└── events/
    ├── 2022-2026/
    ├── 2018-2022/
    └── 2014-2018/
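A rebuild script would presumably start by reading a bill's log files back in order. The sketch below shows one way that first step could look, relying on the timestamp prefix in each filename for ordering. It is an assumption-laden illustration, not the contents of rebuild.py: the function name is hypothetical, and what each event contains is left opaque.

```python
import json
from pathlib import Path
from typing import Iterator


def replay_logs(bill_dir: Path) -> Iterator[tuple[str, dict]]:
    """Yield (filename, event) pairs from bill_dir/logs in chronological order."""
    # The UTC timestamp prefix (e.g. 20240115T123045Z) sorts lexicographically by time.
    for path in sorted((bill_dir / "logs").glob("*.json"), key=lambda p: p.name):
        with path.open(encoding="utf-8") as handle:
            yield path.name, json.load(handle)
```

A full rebuild would then fold these events into Open Civic Data objects, a step that depends on event schemas this document has not yet fixed.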
Main Repo
The primary repo (also generated) that people can clone to get all civic data easily via the submodules.
open-civic-data-blockchain/
├── .gitmodules
├── README.md
├── scripts/
│ ├── update_all.sh
│ ├── integrity_check.py
│ └── generate_cross_jurisdictional_report.py
└── jurisdictions/
├── country:us/
│ ├── state:il/ # Illinois submodule
│ ├── state:ca/ # California submodule
│ ├── state:ny/ # New York submodule
│ ├── district:dc/ # Washington DC submodule
│ ├── county:us/state:va/fairfax/ # Fairfax County submodule
│ └── place:us/state:tx/austin/ # City of Austin submodule
├── country:ca/
│ ├── province:on/ # Ontario province submodule
│ └── province:bc/ # British Columbia submodule
└── country:uk/
├── england/ # England submodule
└── scotland/ # Scotland submodule
TODO List
- Timestamps: Scrape-Oriented vs. Gov-Oriented
  Are log timestamps the time we scraped the data, or the time of the actual government update?
  What if a specific event doesn’t have a timestamp?
  ➤ Open Civic Data also discussed this
- Unique IDs
  OpenStates uses a lot of generated UUIDs. Ideally, our folder/file structure and naming conventions should follow official legislative data.
  - Jurisdiction ID: Follows OCD naming convention: country:us/state:fl/government
  - Session ID: TODO
  - Bill ID: jurisdiction_id/sessions/:session_id/bill.identifier (use official ID like HB250)
  - Vote Event ID: TODO
  - Person ID: TODO
  - Event ID: TODO
- Bill Folder + Filename Convention
  - bill.metadata: bill_id/log/metadata_update_{TODO}.json
  - bill.actions: bill_id/log/action_{TODO}.json
  - bill.votes: bill_id/log/vote_{TODO}.json
  - bill.sponsors: bill_id/log/sponsor_update_{TODO}.json
  - bill.versions:
    - File: bill_id/files/version_{TODO}.pdf
    - Log: bill_id/log/version_add_{TODO}.json (we can extract PDF content to JSON)
  - bill.documents:
    - File: bill_id/files/documents_{TODO}.pdf
    - Log: bill_id/log/document_add_{TODO}.json (we can extract PDF content to JSON)
- Event Folder Convention
  Events tied to sessions should live inside the session folder.
  Out-of-session events: can we define a reliable alternate time span for organization?
- How to Handle Metadata Changes
  Metadata (like bill) may change from scrape to scrape.
  Use fieldMask for lightweight updates, or consider JSON Patch.
  ➤ https://jsonpatch.com/
  // bill.metadata_events
  { "fieldMask": ["from_organization"], "bill": { "from_organization": "" } }
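To show how a fieldMask update like the one above might be applied, here is a minimal sketch. It assumes the mask lists top-level field names only and that the bill object carries the new values for exactly those fields; both are assumptions, since the proposal has not settled the semantics. The function name and sample values are hypothetical.

```python
import copy


def apply_field_mask(current: dict, update: dict) -> dict:
    """Return a copy of `current` with only the fields named in fieldMask replaced."""
    result = copy.deepcopy(current)
    new_values = update.get("bill", {})
    for field_name in update.get("fieldMask", []):
        if field_name in new_values:
            result[field_name] = copy.deepcopy(new_values[field_name])
    return result


# Hypothetical current metadata and a masked update event.
current_bill = {"title": "Example Act", "from_organization": "House"}
update_event = {"fieldMask": ["from_organization"], "bill": {"from_organization": "Senate"}}
updated_bill = apply_field_mask(current_bill, update_event)
```

Fields outside the mask are left alone even if the update's bill object happens to include them, which keeps each log entry small and its intent explicit.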
Environment Setup
For now, we aren’t doing any coding that touches the previous code. All code/decisions should be in this scraper_next folder as an isolated experiment. If you don’t have git access, message @sartaj.
Easy: Download Data and Explore With SQL Explorers
- OpenState Illinois Scraper Output Files
- State/Federal OpenStates Data Explorer
- password is ChiHackNight closing group phrase all lowercase
- Chicago OCD Data Explorer Explore Councilmatic PG Dump for Chicago OCD data
Advanced: Running Scrapers / Importing PG Dumps
- Open States
- via Scraper. We are using this for v1. Running the scrapers directly keeps the data much more up to date. It also allows us to run certain scrapers, like USA, multiple times a day.
- via SQL Dump, which updates every few days, and has bill full text, in addition to a lot of other content like maps data.
- Chicago SQL Dump. This updates every night and is managed by Datamade, who we have already been collaborating with on Chicago data. They also do stuff like AI summaries that we can pre-pull.
Prior Art
- Washington DC made GitHub their official law source of truth. It looks immutable.
- How append-only logs are used in p2p/blockchain applications.
- Beginners guide to event sourced databases and their benefits.
- Bluesky LGBTQ+ Legislation Alerts: This incredible team has manually created a system that I think we can make tooling for, and that they would potentially want to use.
Communications
- Discussion via Slack
- Task Board via Slack
- (this file) Collaborative Brainstorming via Git: Feel free to edit.
Bill Bot Designer
Overview
The Bill Bot Designer is a tool for creating automated bots that monitor and publish legislative updates from the Open Civic Data Blockchain. These bots can be configured to watch specific bills, jurisdictions, or events and automatically post updates to various platforms like Bluesky, Twitter, RSS feeds, or custom webhooks.
Why Bots?
- 🤖 Automated Monitoring: Bots can continuously watch for legislative changes without human intervention
- 📢 Multi-Platform Publishing: Single bot configuration can publish to multiple platforms simultaneously
- 🎯 Targeted Alerts: Organizations can create highly specific feeds for their constituents
- ⚡ Real-Time Updates: Instant notifications when important legislative events occur
- 🔄 Consistent Formatting: Standardized message formats across all platforms
Bot Architecture
Event-Driven Design
Bots operate on an event-driven architecture, listening to the append-only log of legislative events:
Legislative Event → Blockchain Log → Bot Filter → Message Generation → Platform Publishing
Bot Components
- Event Listener: Monitors the blockchain log for new events
- Filter Engine: Applies rules to determine if an event should trigger the bot
- Message Generator: Creates platform-specific messages from event data
- Publisher: Sends messages to configured platforms
- Rate Limiter: Ensures compliance with platform API limits
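The components above can be wired together in a very small loop. The sketch below covers the filter, message generation, and publishing steps with plain functions; the event field names (jurisdiction, event_type, bill_identifier, description) and all function names are hypothetical, and the publisher is just a callable so any platform client could be substituted.

```python
from typing import Callable, Iterable


def passes_filter(event: dict, jurisdiction: str, event_types: set[str]) -> bool:
    """Filter Engine: keep events from one jurisdiction with a wanted event type."""
    return event.get("jurisdiction") == jurisdiction and event.get("event_type") in event_types


def build_message(event: dict) -> str:
    """Message Generator: turn an event into a short text update."""
    return f"{event['bill_identifier']}: {event['description']}"


def run_bot(
    events: Iterable[dict],
    jurisdiction: str,
    event_types: set[str],
    publish: Callable[[str], None],
) -> int:
    """Event Listener + Publisher: send a message for each matching event."""
    sent = 0
    for event in events:
        if passes_filter(event, jurisdiction, event_types):
            publish(build_message(event))
            sent += 1
    return sent
```

Passing a list's append method as publish is a convenient way to dry-run a configuration before connecting a real platform.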
Configuration Examples
Basic Bill Monitor Bot
name: "Illinois Bill Monitor"
description: "Monitors all Illinois bills for key actions"
# Event filtering
filters:
  - jurisdiction: "country:us/state:il"
  - event_types: ["bill_introduced", "bill_passed", "bill_vetoed"]
  - keywords: ["environment", "education", "healthcare"]
# Message template
message_template: |
  📋 {bill.identifier}: {bill.title}
  🏛️ {action.description}
  📅 {action.date}
  🔗 {bill.url}
# Publishing platforms
platforms:
  - type: "bluesky"
    account: "@legislative-alerts.bsky.social"
    rate_limit: "10/hour"
  - type: "rss"
    feed_url: "https://example.com/il-bills.xml"
    update_frequency: "immediate"
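Placeholders such as {bill.identifier} happen to line up with the attribute lookups that Python's str.format already supports, which makes a basic renderer short. The sketch below is one possible approach, not the designer's actual template engine; the render function and the sample values are hypothetical, and the emoji from the template above are omitted for brevity.

```python
from types import SimpleNamespace

TEMPLATE = "{bill.identifier}: {bill.title}\n{action.description}\n{action.date}\n{bill.url}"


def render(template: str, **objects: dict) -> str:
    """Fill dotted placeholders like {bill.title} from plain dictionaries."""
    # Wrap each dict so str.format can resolve `.field` as attribute access.
    namespaces = {name: SimpleNamespace(**fields) for name, fields in objects.items()}
    return template.format(**namespaces)


message = render(
    TEMPLATE,
    bill={"identifier": "SB1234", "title": "Example Act", "url": "https://example.com/sb1234"},
    action={"description": "Passed Senate", "date": "2024-03-15"},
)
```

This only covers simple substitution; the conditional and filter syntax shown under Advanced Features would need a real template engine.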
Specialized Committee Bot
name: "Senate Appropriations Monitor"
description: "Tracks all bills going through Senate Appropriations"
filters:
  - jurisdiction: "country:us/state:il"
  - committee: "Senate Appropriations"
  - event_types: ["bill_referred", "bill_hearing_scheduled", "bill_vote"]
message_template: |
  💰 Senate Appropriations Update
  📋 {bill.identifier}: {bill.title}
  📊 Fiscal Impact: {bill.fiscal_note.summary}
  📅 Next Action: {next_action.description}
  🗓️ Date: {next_action.date}
platforms:
  - type: "webhook"
    url: "https://api.example.com/appropriations-webhook"
    headers:
      Authorization: "Bearer {webhook_token}"
  - type: "email"
    recipients: ["budget@example.org", "finance@example.org"]
    subject: "Senate Appropriations Alert: {bill.identifier}"
Constituent Alert Bot
name: "District 5 Constituent Alerts"
description: "Alerts constituents about bills affecting their district"
filters:
  - jurisdiction: "country:us/state:il"
  - sponsor_district: "5"
  - event_types: ["bill_introduced", "bill_passed", "bill_signed"]
message_template: |
  🏠 District 5 Update
  📋 {bill.identifier}: {bill.title}
  👤 Sponsored by: {sponsor.name}
  📝 Summary: {bill.summary}
  📅 Status: {bill.status}
  🔗 Learn more: {bill.url}
platforms:
  - type: "sms"
    phone_numbers: ["+15551234567", "+15559876543"]
    provider: "twilio"
  - type: "slack"
    channel: "#district-5-alerts"
    workspace: "example-org"
Platform Integrations
Bluesky
- Rate Limit: 10 posts per hour
- Character Limit: 300 characters
- Features: Rich text, links, images
- Authentication: App password required
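A limit such as 10 posts per hour can be enforced on the bot side with a sliding window: remember when recent posts went out and refuse a new one while the window is full. The sketch below illustrates that technique; the class name is hypothetical, and the clock value is passed in explicitly so the behavior is easy to check.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_posts` within any `window_seconds` span."""

    def __init__(self, max_posts: int, window_seconds: float) -> None:
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self._sent: deque[float] = deque()  # timestamps of recent posts

    def allow(self, now: float) -> bool:
        """Record and permit a post at time `now`, or return False if over the limit."""
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_posts:
            self._sent.append(now)
            return True
        return False


# 10 posts per hour, matching the limit listed above.
bluesky_limiter = SlidingWindowLimiter(max_posts=10, window_seconds=3600)
```

In a running bot, now would typically come from time.monotonic(), and a refused post would be queued rather than dropped.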
Twitter/X
- Rate Limit: 300 tweets per 3 hours
- Character Limit: 280 characters
- Features: Text, images, polls
- Authentication: OAuth 2.0
RSS Feeds
- Format: RSS 2.0 or Atom
- Update Frequency: Configurable
- Features: Full text, categories, enclosures
- Hosting: GitHub Pages, custom server
Webhooks
- Method: POST
- Content-Type: application/json
- Authentication: Bearer token or API key
- Retry Logic: Exponential backoff
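Exponential backoff means waiting longer after each failed delivery, usually doubling the delay up to a cap. The sketch below computes such a schedule and applies it around a send function. All names and default values are hypothetical, and the send and sleep callables are injected so no real network or waiting is involved.

```python
from typing import Callable, Iterable


def backoff_delays(
    base_seconds: float = 1.0,
    factor: float = 2.0,
    max_retries: int = 5,
    cap_seconds: float = 60.0,
) -> list[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ... capped."""
    return [min(cap_seconds, base_seconds * factor ** attempt) for attempt in range(max_retries)]


def send_with_retry(
    send: Callable[[dict], bool],
    payload: dict,
    delays: Iterable[float],
    sleep: Callable[[float], None],
) -> bool:
    """Try once immediately, then once after each delay, until send returns True."""
    if send(payload):
        return True
    for delay in delays:
        sleep(delay)
        if send(payload):
            return True
    return False
```

Production code would normally add random jitter to the delays so that many bots retrying at once do not hit the webhook in lockstep.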
Email
- Providers: SMTP, SendGrid, Mailgun
- Templates: HTML and plain text
- Attachments: PDF bills, documents
- Rate Limits: Varies by provider
SMS
- Providers: Twilio, AWS SNS
- Character Limit: 160 characters
- Features: Text only
- Cost: Per message
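The character limits listed above mean one message may need a different length on each platform. A minimal sketch of a length guard follows, using the limits from this section; the function name and the ellipsis convention are assumptions, and Python's `len` counts code points, which may not match how a given platform counts emoji.

```python
# Limits copied from the platform list above; SMS assumes a single 160-character message.
CHAR_LIMITS = {"bluesky": 300, "twitter": 280, "sms": 160}


def fit_to_platform(message: str, platform: str, ellipsis: str = "…") -> str:
    """Return the message unchanged if it fits, otherwise cut it and append an ellipsis."""
    limit = CHAR_LIMITS[platform]
    if len(message) <= limit:
        return message
    # Reserve room for the ellipsis so the result never exceeds the limit.
    return message[: limit - len(ellipsis)].rstrip() + ellipsis


print(fit_to_platform("HB 1234 passed the Senate.", "sms"))
```

Cutting at the end keeps the bill identifier and title, which the templates place first, and sacrifices trailing details such as the summary.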
Advanced Features
Conditional Logic
filters:
  - jurisdiction: "country:us/state:il"
  - conditions:
      - if: "bill.fiscal_impact > 1000000"
        then: "priority = high"
      - if: "bill.sponsor.party == 'Republican'"
        then: "include_opposition_analysis = true"
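One way to interpret the `if`/`then` rules above is as a small "path, operator, literal" expression language. The sketch below evaluates such rules without `eval()`. The grammar, the helper names, and the nested-dict shape of the bill record are all assumptions made for illustration, not Govbot's documented behavior.

```python
import ast
import operator
from functools import reduce

# Comparison operators the sketch understands.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq,
       ">=": operator.ge, "<=": operator.le, "!=": operator.ne}


def lookup(context: dict, dotted: str):
    """Resolve a dotted path such as 'bill.sponsor.party' in nested dicts."""
    return reduce(lambda obj, key: obj[key], dotted.split("."), context)


def evaluate(condition: str, context: dict) -> bool:
    """Evaluate a '<path> <op> <literal>' condition without calling eval()."""
    path, op, literal = condition.split(maxsplit=2)
    return OPS[op](lookup(context, path), ast.literal_eval(literal))


def apply_rules(rules: list[dict], context: dict) -> dict:
    """Collect the 'key = value' assignment of every rule whose 'if' holds."""
    result = {}
    for rule in rules:
        if evaluate(rule["if"], context):
            key, value = (part.strip() for part in rule["then"].split("=", 1))
            result[key] = value
    return result


rules = [
    {"if": "bill.fiscal_impact > 1000000", "then": "priority = high"},
    {"if": "bill.sponsor.party == 'Republican'", "then": "include_opposition_analysis = true"},
]
# Made-up record for illustration.
context = {"bill": {"fiscal_impact": 2_500_000, "sponsor": {"party": "Democratic"}}}
print(apply_rules(rules, context))
```

Parsing the literal with `ast.literal_eval` keeps arbitrary code in a config file from being executed.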
Message Templates with Variables
message_template: |
  {#if bill.fiscal_impact > 1000000}💰 HIGH COST BILL {/if}
  📋 {bill.identifier}: {bill.title}
  👤 Sponsor: {bill.sponsors[0].name} ({bill.sponsors[0].party})
  📊 Fiscal Impact: ${bill.fiscal_impact:,.0f}
  📅 {action.date | date_format: "%B %d, %Y"}
  🔗 {bill.url}
  {#if bill.summary}
  📝 {bill.summary | truncate: 200}
  {/if}
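The template above pipes values through filters such as `truncate: 200` and `date_format: "%B %d, %Y"`. The sketch below shows how a `name: argument` filter could be applied to a value. The filter registry and the argument parsing rules are assumptions; the source does not specify how Govbot implements them.

```python
from datetime import date


def truncate(text: str, length: int) -> str:
    """Cut text to at most `length` characters, ending in '...' when shortened."""
    return text if len(text) <= length else text[: length - 3].rstrip() + "..."


def date_format(value: date, pattern: str) -> str:
    """Format a date with a strftime pattern such as '%B %d, %Y'."""
    return value.strftime(pattern)


# Hypothetical registry mapping filter names to functions.
FILTERS = {"truncate": truncate, "date_format": date_format}


def apply_filter(value, spec: str):
    """Apply a filter written as 'name: argument', for example 'truncate: 200'."""
    name, _, raw_arg = spec.partition(":")
    raw_arg = raw_arg.strip()
    # Quoted arguments are treated as strings; anything else is assumed to be an integer.
    arg = raw_arg[1:-1] if raw_arg.startswith('"') else int(raw_arg)
    return FILTERS[name.strip()](value, arg)


print(apply_filter(date(2024, 12, 3), 'date_format: "%B %d, %Y"'))
print(len(apply_filter("a" * 250, "truncate: 200")))
```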
Scheduled Publishing
publishing:
  schedule:
    - time: "09:00"
      timezone: "America/Chicago"
      days: ["monday", "tuesday", "wednesday", "thursday", "friday"]
    - time: "17:00"
      timezone: "America/Chicago"
      days: ["monday", "tuesday", "wednesday", "thursday", "friday"]
  batch_size: 5
  delay_between_posts: "30s"
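The schedule above implies two calculations: finding the next weekday slot at 09:00 or 17:00, and splitting queued posts into batches of five. A minimal sketch of both follows. It assumes the caller passes a datetime already expressed in the schedule's time zone (for example via `zoneinfo.ZoneInfo("America/Chicago")`), and the function names are hypothetical.

```python
from datetime import datetime, time, timedelta

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]
ACTIVE_DAYS = {"monday", "tuesday", "wednesday", "thursday", "friday"}
SLOTS = [time(9, 0), time(17, 0)]  # the two publish times from the config above


def next_slot(now: datetime) -> datetime:
    """Return the first scheduled publish time strictly after `now`."""
    for offset in range(8):  # today plus the next seven days covers a weekly schedule
        day = now.date() + timedelta(days=offset)
        if DAY_NAMES[day.weekday()] not in ACTIVE_DAYS:
            continue
        for slot in SLOTS:
            # Reuse the caller's tzinfo so naive and aware datetimes both work.
            candidate = datetime.combine(day, slot, tzinfo=now.tzinfo)
            if candidate > now:
                return candidate
    raise ValueError("schedule has no active days")


def batches(posts: list, batch_size: int = 5):
    """Yield posts in groups of at most `batch_size`."""
    for start in range(0, len(posts), batch_size):
        yield posts[start : start + batch_size]


# Friday evening rolls over to Monday morning.
print(next_slot(datetime(2024, 12, 6, 18, 0)))
```

A publisher would sleep until `next_slot`, send one batch, and pause for `delay_between_posts` between individual posts.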
Analytics and Monitoring
analytics:
  track_engagement: true
  platforms:
    - bluesky
    - twitter
    - webhook
  metrics:
    - posts_sent
    - engagement_rate
    - error_rate
    - response_time
  alerts:
    - condition: "error_rate > 0.05"
      action: "email_admin"
    - condition: "no_posts_24h"
      action: "slack_alert"
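The two alert conditions above can be checked against a small set of counters. The sketch below assumes `error_rate` means failed deliveries divided by all attempts and that `no_posts_24h` means at least 24 hours since the last post; both definitions, and the `Stats` class itself, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical counters a bot might keep per reporting window."""
    posts_sent: int
    errors: int
    hours_since_last_post: float

    @property
    def error_rate(self) -> float:
        """Share of failed deliveries among all attempts (0.0 if nothing was attempted)."""
        attempts = self.posts_sent + self.errors
        return self.errors / attempts if attempts else 0.0


def triggered_actions(stats: Stats) -> list[str]:
    """Return the alert actions from the config above whose conditions hold."""
    actions = []
    if stats.error_rate > 0.05:
        actions.append("email_admin")
    if stats.hours_since_last_post >= 24:
        actions.append("slack_alert")
    return actions


print(triggered_actions(Stats(posts_sent=90, errors=10, hours_since_last_post=1.0)))
```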
Best Practices
Content Guidelines
- Be Accurate: Always verify data before publishing
- Stay Neutral: Present information without bias
- Include Context: Provide background information when relevant
- Use Clear Language: Avoid jargon and technical terms
- Include Sources: Always link to official sources
Technical Guidelines
- Rate Limiting: Respect platform API limits
- Error Handling: Implement retry logic and fallbacks
- Monitoring: Track bot performance and errors
- Testing: Test configurations before going live
- Documentation: Document bot purposes and configurations
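The retry advice above, together with the exponential backoff listed for webhooks, can be sketched as a small wrapper around any delivery function. The `send` callable stands in for a real platform call, and the attempt count and base delay are assumed defaults rather than documented Govbot settings.

```python
import time
from typing import Callable


def deliver_with_backoff(
    send: Callable[[], None],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it succeeds, doubling the wait after each failure.

    Returns the number of attempts used and re-raises the last error if all fail.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            send()
            return attempt + 1
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # waits of 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting, and a fallback platform can be tried when the final error is raised.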
Legal Considerations
- Copyright: Respect copyright on bill text and documents
- Attribution: Always credit original sources
- Disclaimers: Include appropriate disclaimers
- Compliance: Follow platform terms of service
- Privacy: Don’t collect or store personal information
Getting Started
1. Choose Your Use Case
- General Monitoring: Track all bills in a jurisdiction
- Committee Focus: Monitor specific committees
- Issue-Based: Track bills by topic or keywords
- Constituent Service: Alert constituents about relevant bills
2. Design Your Filters
- Jurisdiction: Which government body to monitor
- Event Types: What actions to track
- Keywords: Specific topics or terms
- Sponsors: Bills from specific legislators
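The four filter dimensions above can be combined into a single predicate over incoming events. The sketch below assumes each event is a dict with `jurisdiction`, `event_type`, `title`, and `sponsor` keys; that shape and the `EventFilter` class are hypothetical and only illustrate the matching logic.

```python
from dataclasses import dataclass, field


@dataclass
class EventFilter:
    """Jurisdiction is required; an empty set for any other criterion matches everything."""
    jurisdiction: str
    event_types: set[str] = field(default_factory=set)
    keywords: set[str] = field(default_factory=set)
    sponsors: set[str] = field(default_factory=set)

    def matches(self, event: dict) -> bool:
        """Return True only if the event satisfies every configured criterion."""
        if event["jurisdiction"] != self.jurisdiction:
            return False
        if self.event_types and event["event_type"] not in self.event_types:
            return False
        title = event["title"].lower()
        if self.keywords and not any(word.lower() in title for word in self.keywords):
            return False
        if self.sponsors and event["sponsor"] not in self.sponsors:
            return False
        return True


# Made-up event for illustration.
event = {
    "jurisdiction": "country:us/state:il",
    "event_type": "bill_introduced",
    "title": "Regional Transit Funding Act",
    "sponsor": "Jane Doe",
}
transit_filter = EventFilter(
    "country:us/state:il", event_types={"bill_introduced"}, keywords={"transit"}
)
print(transit_filter.matches(event))
```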
3. Create Your Message Template
- Platform Limits: Consider character limits
- Required Information: Bill ID, title, action, date
- Optional Details: Sponsor, summary, fiscal impact
- Call to Action: Links to learn more or take action
4. Configure Platforms
- Primary Platform: Choose your main publishing platform
- Secondary Platforms: Add additional platforms for reach
- Testing: Test with a small audience first
- Monitoring: Set up alerts and analytics
5. Deploy and Monitor
- Gradual Rollout: Start with limited scope
- Monitor Performance: Track engagement and errors
- Iterate: Refine based on feedback and data
- Scale: Expand to additional jurisdictions or topics
Examples in Action
Bluesky Legislative Alerts
The Bluesky LGBTQ+ Legislation Alerts bot demonstrates how effective automated legislative monitoring can be. It:
- Monitors bills across multiple states
- Filters for LGBTQ+ related legislation
- Posts concise, informative updates
- Builds a community around legislative transparency
Chicago Councilmatic
The Chicago Councilmatic system shows how bots can enhance existing civic data platforms:
- Integrates with existing Open Civic Data sources
- Provides real-time updates on city council activities
- Maintains historical records of all legislative actions
- Enables custom feeds for different stakeholders
Future Enhancements
AI-Powered Features
- Smart Summaries: AI-generated bill summaries
- Impact Analysis: Automated analysis of bill effects
- Sentiment Analysis: Track public opinion on bills
- Predictive Modeling: Forecast bill outcomes
Advanced Integrations
- Calendar Integration: Add events to personal calendars
- CRM Integration: Track constituent interactions
- Newsletter Integration: Compile weekly summaries
- API Access: Allow third-party integrations
Enhanced Analytics
- Engagement Tracking: Measure bot effectiveness
- A/B Testing: Test different message formats
- Audience Insights: Understand who’s following bots
- Performance Optimization: Improve delivery rates
Windy Civi
A unified portal for Chicago residents that shows local, state, and federal bills with AI-generated summaries and topic tags, and lets users subscribe to notifications.
Contributors
- Andrew Dauphinais
- Emme
- Kari Douglass
- Marissa Heffler
- Sartaj Chowdhury
- Zach Schoneman
Additional Contributors
- haileyplusplus
- Fiona Tang
- Miroslava Osorio
- Nate Johnson
December 2024 Presentation
The following is our presentation from December 2024.
Slides
Civi Social + MyChicago
Allow residents of Chicago to directly interact with their elected officials.
Contributors
Additional Contributors
- Charles Cole
- Sue Kwong
History
Socratic.Center
Easily find who represents you.
During this era, the team built www.socratic.center, a site for easily finding your representative. The project then joined Chi Hack Night as a breakout group.
Contributors
History