Docs
dbt integration.
The dbt model generator turns observed payload shapes into typed staging models. Here's how it works, how to regenerate without losing edits, and how to make it part of your CI.
How generation works
A schema-observer Worker walks each sampled webhook payload (default 1 in 10) and records, for every JSON path it sees: the observed types, total counts, null counts, sample values, first-seen, and last-seen timestamps. The observations live in D1 in the schema_observations table.
When you hit Generate in the hub, we:
- Filter to fields seen in ≥X% of events (default 1%)
- Walk the tree and assign a SQL type per path based on observed types
- Render
stg_<endpoint>.sqland_hooktopus__sources.yml - Bundle them into a ZIP and stream it back
Type extraction rules
- Always numeric →
SAFE_CAST(JSON_VALUE(...) AS INT64)orFLOAT64 - Always string →
JSON_VALUE(...) - Mixed types →
JSON_VALUE(...)with explanatory comment - Object →
JSON_QUERY(...)(returns the sub-tree) - Array →
JSON_QUERY(...)(you unnest in downstream models)
Where to put the files
The ZIP contains exactly two files. Drop them into your dbt project at:
my-dbt-project/
└── models/
└── staging/
└── hooktopus/
├── stg_stripe.sql
└── _hooktopus__sources.yml
What the source file looks like
Hooktopus generates a sources declaration with a freshness check so dbt warns you when events stop arriving. Replace your-project and your-dataset as needed — the generator pre-fills the destination you picked.
version: 2
sources:
- name: hooktopus
database: your-project
schema: your_dataset
tables:
- name: events
identifier: events
loaded_at_field: received_at
freshness:
warn_after: { count: 1, period: hour }
error_after: { count: 6, period: hour }
columns:
- name: event_id
tests:
- not_null
- unique
Regenerating without losing your edits
The hard problem: you edited stg_stripe.sql to add a derived column, and now Stripe added a field and you want to regenerate. Two patterns work:
Pattern A — Override layer (recommended)
Keep stg_stripe.sql regen-clean. Put your custom columns in a layer above it, like stg_stripe_enriched.sql. Regenerate freely; your derived layer is untouched.
{{ config(materialized='view') }}
select
*,
amount_cents / 100.0 as amount_dollars,
case when event_type like 'charge.%' then 'charge' else 'other' end as category
from {{ ref('stg_stripe') }}
Pattern B — Pin and own
Stop regenerating after the first version. Edit stg_stripe.sql freely. Hooky still detects drift and alerts you; you handle the SQL changes manually. Best when your team has strong dbt opinions you don't want overridden.
Wiring into CI
The freshness check we generate fires a dbt source freshness warning if no events arrived in the last hour. Add this to your CI to catch silent endpoint failures:
# In your scheduled CI run, before `dbt run`:
dbt source freshness --select source:hooktopus
Testing the generated model
# Confirm it builds at all
dbt run --select stg_stripe
# Confirm the freshness + tests pass
dbt test --select stg_stripe
# Inspect typed columns
dbt show --select stg_stripe --limit 10