$ man clay-wiki/data-storage-batching
Core Concepts · advanced
Data Storage, Batching, and Rate-Limit-Safe Enrichment
Clay is orchestration, not storage — use Supabase for scale
Clay Is Orchestration, Not Storage
Clay is brilliant at enrichment, qualification, and routing. It is not a database. If your TAM is 50K+ accounts, don't try to keep everything in Clay tables. Clay slows down at scale. Row limits exist. The UI gets laggy. You lose the ability to query your data with SQL. The answer: connect Supabase or Postgres as your storage layer. Clay enriches and scores. Supabase stores. You get the best of both — Clay's enrichment engine with a real database's querying power.
PATTERN
The Batch-and-Store Pattern
For large TAMs (5,000+ accounts): (1) Load a batch of 1,000-5,000 accounts into Clay. (2) Run your full enrichment flow — firmographics, ICP scoring, tech stack, MX records. (3) Push enriched records to Supabase via HTTP column or webhook. (4) Clear the Clay table. (5) Load the next batch. (6) Repeat until done. (7) Query the full enriched TAM in Supabase.
This keeps your Clay table fast, avoids row limits, and gives you SQL-level querying over your entire dataset. The Clay table is a processing queue, not a warehouse.
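The loop above can be sketched in a few lines. This is a minimal driver, not a Clay API: the four callables (`load_into_clay`, `run_enrichment`, `push_to_supabase`, `clear_clay_table`) are placeholders for however you move data in and out of Clay (CSV import, API, HTTP columns).

```python
from typing import Callable, Iterator, List

def chunks(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Split the full TAM into fixed-size batches for Clay."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batch_and_store(
    accounts: List[dict],
    batch_size: int,
    load_into_clay: Callable[[List[dict]], None],
    run_enrichment: Callable[[List[dict]], List[dict]],
    push_to_supabase: Callable[[List[dict]], None],
    clear_clay_table: Callable[[], None],
) -> None:
    """Batch-and-store driver: Clay is the processing queue,
    Supabase is the warehouse. All callables are placeholders."""
    for batch in chunks(accounts, batch_size):
        load_into_clay(batch)              # (1) load a batch
        enriched = run_enrichment(batch)   # (2) run the enrichment flow
        push_to_supabase(enriched)         # (3) store the results
        clear_clay_table()                 # (4) clear, then repeat
```

The only real logic is `chunks`; the driver just makes the queue-not-warehouse discipline explicit.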
PATTERN
Rate-Limit-Safe Enrichment
Every enrichment provider has rate limits. If you blast 5,000 rows through Apollo simultaneously, you'll get throttled. The safe approach: (1) Set Clay's run speed to match the provider's rate limit (check their docs). (2) Batch your runs — don't run enrichment on 5,000 rows at once. Run 500, wait, run 500 more. (3) Use Clay's built-in delay between rows when available. (4) For HTTP columns hitting external APIs, add explicit backoff logic. (5) Monitor the first 50 rows — if errors start appearing, you're being rate-limited. Slow down before you burn credits on failed requests.
The goal: steady throughput without triggering provider-side throttling. Patience beats speed when credits are on the line.
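For HTTP columns hitting your own endpoints, the "explicit backoff logic" in step (4) usually means exponential backoff with jitter. A minimal sketch, assuming your request function surfaces the HTTP status (the `request_fn` shape here is an assumption, not a Clay or provider API):

```python
import random
import time

def call_with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0) -> dict:
    """Retry request_fn with exponential backoff plus jitter.
    request_fn is a placeholder: any callable returning a dict
    with a 'status' key (429 = throttled)."""
    for attempt in range(max_retries):
        response = request_fn()
        if response.get("status") != 429:   # not throttled: done
            return response
        # Throttled: wait base, 2x base, 4x base, ... plus random
        # jitter so parallel rows don't retry in lockstep.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        time.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```

Pair this with a conservative run speed in Clay itself; backoff is the safety net, not the primary throttle.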
PRO TIP
Supabase Integration
The simplest Supabase integration uses Clay's HTTP column to POST enriched rows to your Supabase REST API. Set up a Supabase table that mirrors your Clay columns. Add an HTTP column in Clay that fires on enrichment completion — when all data columns are populated, POST the row to Supabase. Include an upsert key (domain for accounts, email for contacts) so re-runs don't create duplicates.
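The request that HTTP column sends can be sketched as follows. The endpoint shape and headers follow Supabase's REST (PostgREST) conventions, where `on_conflict` plus `Prefer: resolution=merge-duplicates` turns an insert into an upsert; the project URL, table, and key names are placeholders:

```python
def build_upsert_request(
    project_url: str,
    service_key: str,
    table: str,
    row: dict,
    conflict_key: str = "domain",  # email for contact tables
) -> dict:
    """Build the POST that a Clay HTTP column would send to Supabase.
    The on_conflict param + merge-duplicates Prefer header make the
    insert an upsert, so re-runs overwrite instead of duplicating."""
    return {
        "url": f"{project_url}/rest/v1/{table}?on_conflict={conflict_key}",
        "headers": {
            "apikey": service_key,
            "Authorization": f"Bearer {service_key}",
            "Content-Type": "application/json",
            "Prefer": "resolution=merge-duplicates",
        },
        "body": row,
    }
```

In Clay you'd map these straight into the HTTP column's URL, header, and body fields rather than running Python, but the structure is the same.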
For more advanced setups, use Supabase Edge Functions as middleware between Clay and your database. The edge function can validate data, handle conflicts, and trigger downstream workflows (Slack alerts, CRM syncs) that Clay can't.
PATTERN
Batch Size Rules of Thumb
Batch sizes depend on your enrichment complexity: Simple enrichment (1-2 providers, no AI): 5,000 rows per batch. Standard enrichment (waterfall + scoring): 1,000-2,000 rows per batch. Heavy enrichment (Claygent + HTTP columns + waterfall): 500 rows per batch. Enterprise accounts (deep enrichment + multiple personas): 200-500 rows per batch.
Smaller batches = more control, easier debugging, less credit waste on errors. Larger batches = faster throughput but higher risk if something breaks. Start small, increase once you trust the flow.
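The rules of thumb above reduce to a small lookup. The tier names are illustrative labels for this sketch, not Clay terminology, and the "standard" and "enterprise" values are midpoints of the ranges given:

```python
def recommended_batch_size(tier: str) -> int:
    """Starting batch size per enrichment complexity tier.
    Start here, then increase once you trust the flow."""
    sizes = {
        "simple": 5000,     # 1-2 providers, no AI
        "standard": 1500,   # waterfall + scoring (midpoint of 1,000-2,000)
        "heavy": 500,       # Claygent + HTTP columns + waterfall
        "enterprise": 300,  # deep enrichment + personas (midpoint of 200-500)
    }
    return sizes[tier]
```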
ANTI-PATTERN
Anti-Pattern: Everything in One Table
Don't try to build your entire TAM in a single Clay table. I've seen people with 20,000-row tables that take 30 seconds to load, where enrichment columns time out because they're fighting for resources with 40 other columns. Split by phase: sourcing table → enrichment table → scoring table → output table. Or split by batch: batch 1 table, batch 2 table. Each table should be focused and fast. If your table has more than 5,000 rows or more than 30 columns, it's time to split.