ExpandCast

One podcast episode, a week of derivative content. Tenant isolation by Postgres RLS.

Role

Solo founder · Lead engineer

Year

2026

Status

Live

Live site

expandcast.com

The AI Content Multiplier is Live

At a glance

Next.js 16 · Supabase · pgvector · AI SDK v6 · Groq · Stripe

Outputs: Blog · threads · newsletter · clips · thumbnails
Time to derivatives: ~2 minutes
Tenant isolation: Postgres RLS
Stack: pgvector · Groq · AI SDK v6

01 — Overview

Independent podcasters lose the value of an episode the day after they publish it. Blog posts, threads, newsletters, clips and thumbnails are what actually feed the funnel, and each one takes hours of manual work. ExpandCast turns one uploaded episode into all five formats in about two minutes. The output sounds like the creator because it's RAG-grounded on the creator's own transcript, never another tenant's.

It's built so a solo creator can hold a derivative-content calendar without an editor or assistant. Multi-tenancy lives in the database, not the route handler. RLS is the only layer that can't be forgotten.

“Tenant isolation lives in the database, not the code. Forgetting a WHERE clause should never leak another tenant's transcript.”
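To make that concrete, here is a minimal sketch of the kind of policy SQL this implies, written as a small TypeScript generator so it can sit next to a migration script. It is an illustration under stated assumptions, not the production migration: the helper name `tenantRlsSql`, the `tenant_id` column, and the idea that the tenant is simply the signed-in Supabase user (`auth.uid()`) are all hypothetical.

```typescript
// Hypothetical helper: emits the SQL that enables RLS on one table and
// restricts every row operation to the owning tenant.
// Assumptions (not from the write-up): the tenant column is `tenant_id`
// and the tenant is the signed-in Supabase user, exposed as auth.uid().
function tenantRlsSql(table: string, tenantColumn: string = "tenant_id"): string[] {
  // Only allow plain lowercase identifiers, since they are interpolated into SQL.
  const identifier = /^[a-z_][a-z0-9_]*$/;
  if (!identifier.test(table) || !identifier.test(tenantColumn)) {
    throw new Error(`unsafe identifier: ${table}.${tenantColumn}`);
  }
  const owns = `${tenantColumn} = auth.uid()`;
  return [
    `alter table ${table} enable row level security;`,
    [
      `create policy ${table}_tenant_isolation on ${table}`,
      `for all to authenticated`,
      `using (${owns})`,      // filters rows on read, update, delete
      `with check (${owns});`, // rejects writes that target another tenant
    ].join(" "),
  ];
}

// The two tables named in the architecture section.
let statements: string[] = [];
for (const table of ["episodes", "chunks"]) {
  statements = statements.concat(tenantRlsSql(table));
}
console.log(statements.join("\n"));
```

With a policy like this in place, a query that omits its tenant filter returns only the caller's rows instead of everyone's, which is the failure mode the quote above is about.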

02 — Architecture

Steps 01–04 run once per episode on upload. That's the ingest. Steps 05–06 fire on demand: RAG against the tenant's own chunks and voice profile, with the output streamed from Groq. All five derivatives are ready in about two minutes.

Schematic of the ingest pipeline (lime, top) and the on-demand consumption layer (violet, bottom), separated by the embed-ready boundary.

ingest (once per episode)

01 — Upload: episode.mp3 → Supabase Storage (RLS-scoped)

02 — Transcribe: Groq Whisper-large + speaker diarization

03 — Chunk + embed: splitter → pgvector embeddings

04 — Persist: supabase.episodes + chunks (per-tenant RLS)

embed → ready

consumption (on-demand)

05 — Retrieve: pgvector cosine · tenant filter (RLS auto)

06 — Generate + stream: AI SDK v6 · Groq · blog · threads · newsletter · clips · thumbnails

Ingest is one-shot per episode. Consumption is RAG on demand against the tenant's own vectors.
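The retrieval step (05) can be illustrated with a small in-memory sketch. In the pipeline above, pgvector computes the cosine distance (its `<=>` operator) and RLS applies the tenant filter inside Postgres; the code below only mirrors that logic in plain TypeScript so the ranking is easy to follow. The `Chunk` shape, field names, and toy two-dimensional embeddings are assumptions for illustration.

```typescript
// In-memory illustration only. Field names are hypothetical.
interface Chunk {
  tenantId: string;
  episodeId: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only the caller's chunks, then return the k most similar to the query.
function retrieve(chunks: Chunk[], tenantId: string, query: number[], k: number): Chunk[] {
  return chunks
    .filter((c) => c.tenantId === tenantId) // stands in for the RLS policy
    .map((c) => ({ chunk: c, score: cosineSimilarity(c.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy data: two tenants, 2-dimensional embeddings.
const library: Chunk[] = [
  { tenantId: "tenant-a", episodeId: "ep1", text: "mic setup", embedding: [1, 0] },
  { tenantId: "tenant-a", episodeId: "ep1", text: "guest intro", embedding: [0, 1] },
  { tenantId: "tenant-b", episodeId: "ep9", text: "other show", embedding: [1, 0] },
];
console.log(retrieve(library, "tenant-a", [1, 0.1], 2).map((c) => c.text));
```

The tenant-b chunk is a perfect match for the query direction but never appears in the result, because the filter runs before ranking. In the real system that ordering is enforced by the database rather than by application code.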

03 — Key features

/01
Per-tenant by RLS, not by code
Tenant isolation lives in Postgres policies. Every read or write is checked at the row level. A leak requires breaking RLS, not forgetting a WHERE clause.
/02
Voice-matched RAG, not generic AI copy
Generation runs against the tenant's own pgvector chunks, conditioned on a per-tenant voice profile. The output sounds like the creator, not like an LLM. And it can't quote an episode that doesn't belong to whoever's asking.
/03
Streamed from Groq, not polled
AI SDK v6 streams tokens from Groq to the client. Show notes feel instant because the user reads while the model is still writing.
/04
Service-role kept off the browser
The admin client lives strictly behind the server boundary. Anything the browser can call hits the RLS-bound anon client. The privileged surface never gets exposed.
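The service-role boundary in /04 can be sketched as a runtime guard. This is a hedged illustration, not the project's actual module: the function names, the `ServerEnv` shape, and the environment variable names `SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY` are assumptions. The guard refuses to hand out privileged configuration whenever a `window` global is present.

```typescript
// Hypothetical env shape; the variable names are assumptions.
interface ServerEnv {
  SUPABASE_URL?: string;
  SUPABASE_SERVICE_ROLE_KEY?: string;
}

interface AdminConfig {
  url: string;
  serviceRoleKey: string;
}

// True when a `window` global exists, i.e. the code is running in a browser.
// `scope` is injectable so the check can be exercised outside a browser.
function isBrowser(scope: object = globalThis): boolean {
  return typeof (scope as { window?: unknown }).window !== "undefined";
}

// Returns the privileged config, or throws if called from the browser
// or if the server environment is incomplete.
function adminConfig(env: ServerEnv, scope: object = globalThis): AdminConfig {
  if (isBrowser(scope)) {
    throw new Error("service-role config requested in the browser");
  }
  const url = env.SUPABASE_URL;
  const serviceRoleKey = env.SUPABASE_SERVICE_ROLE_KEY;
  if (!url || !serviceRoleKey) {
    throw new Error("missing SUPABASE_URL or SUPABASE_SERVICE_ROLE_KEY");
  }
  return { url, serviceRoleKey };
}
```

On the server this would be called as `adminConfig(process.env)` and the result passed to the admin client constructor. A runtime guard is a second line of defense: in a Next.js codebase, importing the `server-only` package in the same module fails the build earlier if client code ever pulls it in.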

04 — Technical decisions

05 — What I'd do differently

/01
Build a re-embed migration path on day one. When the embedding model changes (and it will), existing tenants stay frozen on the old vectors. A background cron that re-embeds tenant by tenant would have saved a manual ops pass on every upgrade.
/02
Materialize derivative outputs keyed on (episode_id, derivative_type, model_version). Right now every 'blog post' or 'thread' request hits Groq even when the user generated it last week. Caching the materialized output cuts credit burn ~3–5x on revisits, and putting model_version in the key makes invalidation explicit.
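The caching idea in /02 can be sketched as follows. This is a minimal in-memory version under stated assumptions: the `DerivativeCache` class, the `DerivativeKey` shape, and the derivative type labels are hypothetical, and a real implementation would presumably persist rows in Postgres with a unique constraint on (episode_id, derivative_type, model_version) rather than hold them in a `Map`.

```typescript
// Hypothetical labels for the five output formats.
type DerivativeType = "blog" | "thread" | "newsletter" | "clip" | "thumbnail";

interface DerivativeKey {
  episodeId: string;
  derivativeType: DerivativeType;
  modelVersion: string;
}

// Serialize the composite key. JSON avoids collisions that a plain
// delimiter join could produce if an id contained the delimiter.
function cacheKey(key: DerivativeKey): string {
  return JSON.stringify([key.episodeId, key.derivativeType, key.modelVersion]);
}

class DerivativeCache<T> {
  private store = new Map<string, T>();
  hits = 0;
  misses = 0;

  // Return the stored output for this key, or create and store it once.
  getOrCreate(key: DerivativeKey, create: () => T): T {
    const id = cacheKey(key);
    if (this.store.has(id)) {
      this.hits++;
      return this.store.get(id) as T;
    }
    this.misses++;
    const value = create();
    this.store.set(id, value);
    return value;
  }
}

// Usage: the second request for the same key never reaches the generator.
const demoCache = new DerivativeCache<string>();
const demoKey: DerivativeKey = { episodeId: "ep1", derivativeType: "blog", modelVersion: "v1" };
demoCache.getOrCreate(demoKey, () => "generated blog post");
demoCache.getOrCreate(demoKey, () => "never called");
console.log(demoCache.hits, demoCache.misses);
```

Because `model_version` is part of the key, upgrading the model needs no explicit invalidation: requests under the new version simply miss and regenerate, while old rows can be cleaned up on whatever schedule suits. If `T` is a `Promise`, the same structure also deduplicates concurrent requests for one derivative.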