ExpandCast
One podcast episode, a week of derivative content. Tenant isolation by Postgres RLS.

- Outputs: Blog · threads · newsletter · clips · thumbnails
- Time to derivatives: ~2 minutes
- Tenant isolation: Postgres RLS
- Stack: pgvector · Groq · AI SDK v6
Overview
Independent podcasters lose the value of an episode the day after they publish it. Blog posts, threads, newsletters, clips and thumbnails are what actually feed the funnel, and each one takes hours of manual work. ExpandCast turns one uploaded episode into all five formats in about two minutes. The output sounds like the creator because it's RAG-grounded on the creator's own transcript, never another tenant's.
It's built so a solo creator can hold a derivative-content calendar without an editor or assistant. Multi-tenancy lives in the database, not the route handler. RLS is the only layer that can't be forgotten.
“Tenant isolation lives in the database, not the code. Forgetting a WHERE clause should never leak another tenant's transcript.”
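The principle in that quote can be illustrated with a toy in-memory model, written in TypeScript to match the rest of the stack. It is not how Postgres enforces RLS: in a real database the policy would be declared in SQL, roughly like the statement in the comment below, where the `chunks` table, the `tenant_id` column, and the `app.tenant_id` setting are all hypothetical names. The sketch only shows that the policy is applied beneath the query, so a query that forgets its tenant filter still sees nothing from another tenant.

```typescript
// Toy model of row-level security. All names here are hypothetical.
// Rough SQL equivalent (assumed schema, not taken from the project):
//   ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY tenant_isolation ON chunks
//     USING (tenant_id = current_setting('app.tenant_id')::uuid);

interface ChunkRow {
  tenantId: string;
  episodeId: string;
  text: string;
}

type Policy = (row: ChunkRow, currentTenant: string) => boolean;

// The policy is attached to the table, not supplied by the caller.
const tenantIsolation: Policy = (row, currentTenant) => row.tenantId === currentTenant;

function createRlsTable(rows: ChunkRow[], policy: Policy) {
  return {
    // Every read passes through the policy first; the caller's own
    // predicate only ever sees rows the policy already allowed.
    select(currentTenant: string, where: (row: ChunkRow) => boolean = () => true): ChunkRow[] {
      return rows.filter((row) => policy(row, currentTenant)).filter(where);
    },
  };
}

const chunks = createRlsTable(
  [
    { tenantId: "tenant-a", episodeId: "ep-1", text: "Welcome back to the show." },
    { tenantId: "tenant-b", episodeId: "ep-9", text: "Unreleased sponsor details." },
  ],
  tenantIsolation,
);

// A "forgotten WHERE clause": the caller supplies no filter at all.
const visibleToA = chunks.select("tenant-a");
```

Here `visibleToA` contains only tenant A's row, even though the caller never filtered by tenant.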
Architecture
Steps 01–04 run once per episode on upload. That's the ingest. Steps 05–06 fire on demand: RAG against the tenant's own chunks and voice profile, streamed from Groq. Five derivatives in ~2 minutes.
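The individual steps are not enumerated here, so the following is only a sketch of the on-demand half under stated assumptions: each stored chunk carries a tenant ID and an embedding, retrieval ranks by cosine similarity, and the top matches are assembled into a prompt alongside a voice profile. In-memory arrays stand in for pgvector, and `retrieve`, `buildPrompt`, and the field names are hypothetical.

```typescript
interface StoredChunk {
  tenantId: string;
  text: string;
  embedding: number[];
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Top-k retrieval restricted to one tenant. The explicit filter is only
// for this in-memory stand-in; in the described system RLS does this job.
function retrieve(store: StoredChunk[], tenantId: string, query: number[], k: number): StoredChunk[] {
  return store
    .filter((chunk) => chunk.tenantId === tenantId)
    .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble the generation prompt from the voice profile and retrieved excerpts.
function buildPrompt(voiceProfile: string, context: StoredChunk[], task: string): string {
  const excerpts = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Voice profile: ${voiceProfile}\n\nTranscript excerpts:\n${excerpts}\n\nTask: ${task}`;
}

const store: StoredChunk[] = [
  { tenantId: "tenant-a", text: "We talked about pricing experiments.", embedding: [1, 0, 0] },
  { tenantId: "tenant-a", text: "Listener mailbag segment.", embedding: [0, 1, 0] },
  { tenantId: "tenant-b", text: "Another creator's pricing episode.", embedding: [1, 0, 0] },
];

const topChunks = retrieve(store, "tenant-a", [0.9, 0.1, 0], 1);
const threadPrompt = buildPrompt("dry, concise, first person", topChunks, "Write a thread about pricing.");
```

Tenant B's chunk has the same embedding as tenant A's best match, yet it never reaches the prompt because it is excluded before ranking.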

Ingest is one-shot per episode. Consumption is RAG on demand against the tenant's own vectors.
Key features
- /01
Per-tenant by RLS, not by code
Tenant isolation lives in Postgres policies. Every read or write is checked at the row level. A leak requires breaking RLS, not forgetting a WHERE clause.
- /02
Voice-matched RAG, not generic AI copy
Generation runs against the tenant's own pgvector chunks, conditioned on a per-tenant voice profile. The output sounds like the creator, not like an LLM. And it can't quote an episode that doesn't belong to whoever's asking.
- /03
Streamed from Groq, not polled
AI SDK v6 streams tokens from Groq to the client. Show notes feel instant because the user reads while the model is still writing.
- /04
Service-role kept off the browser
The admin client lives strictly behind the server boundary. Anything the browser can call hits the RLS-bound anon client. The privileged surface never gets exposed.
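One common way to keep a privileged client off the browser is a guard that refuses to construct it outside a server runtime. The sketch below assumes a hypothetical `createAdminClient` factory that receives the service-role key from the caller. It is not the project's actual code, and a real setup would lean on the framework's server/client boundary in addition to a runtime check like this one.

```typescript
type Runtime = "server" | "browser";

// Browsers expose a global `window`; server runtimes such as Node do not.
function detectRuntime(globals: object = globalThis): Runtime {
  return "window" in globals ? "browser" : "server";
}

interface AdminClient {
  readonly kind: "admin";
  readonly bypassesRls: true;
}

interface AnonClient {
  readonly kind: "anon";
  readonly bypassesRls: false;
}

// Hypothetical factory: the privileged client can only be built on the server.
function createAdminClient(serviceRoleKey: string, runtime: Runtime = detectRuntime()): AdminClient {
  if (runtime !== "server") {
    throw new Error("Admin client must not be created in the browser");
  }
  if (serviceRoleKey.length === 0) {
    throw new Error("Missing service-role key");
  }
  return { kind: "admin", bypassesRls: true };
}

// Anything browser-callable gets this client, which stays subject to RLS.
function createAnonClient(): AnonClient {
  return { kind: "anon", bypassesRls: false };
}
```

Passing the runtime as a parameter keeps the guard testable: calling `createAdminClient("key", "browser")` throws without needing a real browser.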
Technical decisions
What I'd do differently
- /01
Build a re-embed migration path on day one. When the embedding model changes (and it will), existing tenants stay frozen on the old vectors. A background cron that re-embeds tenant by tenant would have saved a manual ops pass on every upgrade.
- /02
Materialize derivative outputs keyed on (episode_id, derivative_type, model_version). Right now every 'blog post' or 'thread' request hits Groq even when the user generated it last week. Caching the materialized output cuts credit burn ~3–5x on revisits, and putting model_version in the key makes invalidation explicit.
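A minimal sketch of that materialization, assuming an in-memory `Map` as a stand-in for a Postgres table and a hypothetical `generate` callback in place of the Groq call. The composite key mirrors (episode_id, derivative_type, model_version), so bumping the model version misses the cache rather than serving stale output.

```typescript
type DerivativeType = "blog" | "thread" | "newsletter" | "clip" | "thumbnail";

interface DerivativeKey {
  episodeId: string;
  derivativeType: DerivativeType;
  modelVersion: string;
}

// Serialize the composite key; JSON of a tuple avoids delimiter collisions.
function cacheKey(key: DerivativeKey): string {
  return JSON.stringify([key.episodeId, key.derivativeType, key.modelVersion]);
}

// `generate` is a placeholder for the real model call (an assumption here).
function createDerivativeCache(generate: (key: DerivativeKey) => Promise<string>) {
  const materialized = new Map<string, string>();
  let generationCalls = 0;

  return {
    async get(key: DerivativeKey): Promise<string> {
      const id = cacheKey(key);
      const hit = materialized.get(id);
      if (hit !== undefined) {
        return hit; // revisit: no model call, no credits spent
      }
      generationCalls += 1;
      const output = await generate(key);
      materialized.set(id, output);
      return output;
    },
    // Number of times the model was actually invoked.
    calls: (): number => generationCalls,
  };
}
```

With this shape, requesting the same blog post twice triggers one generation, while the same episode and type under a new `modelVersion` triggers a fresh one. This sketch does not deduplicate concurrent requests for the same key.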

