$ man apollo-wiki/supabase-warehouse

Integrationsadvanced

Apollo to Supabase Data Warehouse

Schema design, upsert logic, and account-contact split for Apollo data

☾

Why Supabase

Apollo data needs to live somewhere durable. CSV exports get lost. Spreadsheets do not scale. CRMs have import limits and field constraints. Supabase gives you a free Postgres database with a REST API, real-time subscriptions, and row-level security. Your Apollo sourcing scripts write directly to Supabase. Your enrichment scripts read from and write back to Supabase. Your CRM sync reads from Supabase. It is the central data warehouse that connects everything. And the free tier handles 500MB of data and 2GB of bandwidth - more than enough for most GTM pipelines.

PATTERN

The Account-Contact Split

Two tables. accounts: domain (primary key), company_name, employee_count, industry, annual_revenue, tech_stack, icp_score, enriched_at. contacts: apollo_id (primary key), email, first_name, last_name, title, seniority, domain (foreign key to accounts), persona_tag, sourced_at, enriched_at. The account-contact split prevents duplicate company enrichment. 50 contacts at Microsoft = 1 account row enriched once, not 50 redundant company lookups. This is the same foundational pattern from Clay, applied to your own database.

CODE

Upsert Logic

Every Apollo sourcing run should upsert, not insert. On the accounts table, upsert on domain - if the domain exists, update the fields that changed. On the contacts table, upsert on apollo_id (or email if apollo_id is missing). This prevents duplicates across multiple sourcing runs. In Supabase, use supabase.from('contacts').upsert(data, { onConflict: 'apollo_id' }). In raw SQL:

INSERT INTO contacts (...) VALUES (...) ON CONFLICT (apollo_id) DO UPDATE SET title = EXCLUDED.title, enriched_at = EXCLUDED.enriched_at

. Fields like sourced_at should NOT be overwritten on upsert - you want to preserve when the contact was first sourced.

PRO TIP

Query Patterns

Once your data is in Supabase, you can run pipeline analytics that CRMs make difficult. Examples: "How many VP-level contacts have we sourced at companies with 50-200 employees in the last 30 days?"

SELECT COUNT(*) FROM contacts c JOIN accounts a ON c.domain = a.domain WHERE c.seniority = 'vp' AND a.employee_count BETWEEN 50 AND 200 AND c.sourced_at > NOW() - INTERVAL '30 days'

. Or: "What is our match rate by persona?" Group by persona_tag, count total vs enriched. These queries take seconds. In a CRM, you would be building custom reports and waiting.

← apollo wiki clay wiki →