$ man clay-wiki/data-storage-batching
Core Concepts · advanced
Data Storage, Batching, and Rate-Limit-Safe Enrichment
Clay is orchestration, not storage — use Supabase for scale
Clay Is Orchestration, Not Storage
Clay is brilliant at enrichment, qualification, and routing. It is not a database. If your TAM is 50K+ accounts, don't try to keep everything in Clay tables. Clay slows down at scale. Row limits exist. The UI gets laggy. You lose the ability to query your data with SQL. The answer: connect Supabase or Postgres as your storage layer. Clay enriches and scores. Supabase stores. You get the best of both — Clay's enrichment engine with a real database's querying power.
PATTERN
The Batch-and-Store Pattern
For large TAMs (5,000+ accounts): (1) Load a batch of 1,000-5,000 accounts into Clay. (2) Run your full enrichment flow — firmographics, ICP scoring, tech stack, MX records. (3) Push enriched records to Supabase via HTTP column or webhook. (4) Clear the Clay table. (5) Load the next batch. (6) Repeat until done. (7) Query the full enriched TAM in Supabase.
This keeps your Clay table fast, avoids row limits, and gives you SQL-level querying over your entire dataset. The Clay table is a processing queue, not a warehouse.
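The loop above can be sketched in a few lines. This is a minimal driver, not a Clay API: the four callables (`load_into_clay`, `run_enrichment`, `push_to_supabase`, `clear_clay_table`) are placeholders for however you move data in and out of Clay (CSV import, API, HTTP columns).

```python
from typing import Callable, Iterator, List

def chunks(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Split the full TAM into fixed-size batches for Clay."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batch_and_store(
    accounts: List[dict],
    batch_size: int,
    load_into_clay: Callable[[List[dict]], None],
    run_enrichment: Callable[[List[dict]], List[dict]],
    push_to_supabase: Callable[[List[dict]], None],
    clear_clay_table: Callable[[], None],
) -> None:
    """Batch-and-store driver: Clay is the processing queue,
    Supabase is the warehouse. All callables are placeholders."""
    for batch in chunks(accounts, batch_size):
        load_into_clay(batch)              # (1) load a batch
        enriched = run_enrichment(batch)   # (2) run the enrichment flow
        push_to_supabase(enriched)         # (3) store the results
        clear_clay_table()                 # (4) clear, then repeat
```

The only real logic is `chunks`; the driver just makes the queue-not-warehouse discipline explicit.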
PATTERN
Rate-Limit-Safe Enrichment
Every enrichment provider has rate limits. If you blast 5,000 rows through Apollo simultaneously, you'll get throttled. The safe approach: (1) Set Clay's run speed to match the provider's rate limit (check their docs). (2) Batch your runs — don't run enrichment on 5,000 rows at once. Run 500, wait, run 500 more. (3) Use Clay's built-in delay between rows when available. (4) For HTTP columns hitting external APIs, add explicit backoff logic. (5) Monitor the first 50 rows — if errors start appearing, you're being rate-limited. Slow down before you burn credits on failed requests.
The goal: steady throughput without triggering provider-side throttling. Patience beats speed when credits are on the line.
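For HTTP columns hitting your own endpoints, the "explicit backoff logic" in step (4) usually means exponential backoff with jitter. A minimal sketch, assuming your request function surfaces the HTTP status (the `request_fn` shape here is an assumption, not a Clay or provider API):

```python
import random
import time

def call_with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0) -> dict:
    """Retry request_fn with exponential backoff plus jitter.
    request_fn is a placeholder: any callable returning a dict
    with a 'status' key (429 = throttled)."""
    for attempt in range(max_retries):
        response = request_fn()
        if response.get("status") != 429:   # not throttled: done
            return response
        # Throttled: wait base, 2x base, 4x base, ... plus random
        # jitter so parallel rows don't retry in lockstep.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        time.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```

Pair this with a conservative run speed in Clay itself; backoff is the safety net, not the primary throttle.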
PRO TIP
Supabase Integration
The simplest Supabase integration uses Clay's HTTP column to POST enriched rows to your Supabase REST API. Set up a Supabase table that mirrors your Clay columns. Add an HTTP column in Clay that fires on enrichment completion — when all data columns are populated, POST the row to Supabase. Include an upsert key (domain for accounts, email for contacts) so re-runs don't create duplicates.
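The request that HTTP column sends can be sketched as follows. The endpoint shape and headers follow Supabase's REST (PostgREST) conventions, where `on_conflict` plus `Prefer: resolution=merge-duplicates` turns an insert into an upsert; the project URL, table, and key names are placeholders:

```python
def build_upsert_request(
    project_url: str,
    service_key: str,
    table: str,
    row: dict,
    conflict_key: str = "domain",  # email for contact tables
) -> dict:
    """Build the POST that a Clay HTTP column would send to Supabase.
    The on_conflict param + merge-duplicates Prefer header make the
    insert an upsert, so re-runs overwrite instead of duplicating."""
    return {
        "url": f"{project_url}/rest/v1/{table}?on_conflict={conflict_key}",
        "headers": {
            "apikey": service_key,
            "Authorization": f"Bearer {service_key}",
            "Content-Type": "application/json",
            "Prefer": "resolution=merge-duplicates",
        },
        "body": row,
    }
```

In Clay you'd map these straight into the HTTP column's URL, header, and body fields rather than running Python, but the structure is the same.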
For more advanced setups, use Supabase Edge Functions as middleware between Clay and your database. The edge function can validate data, handle conflicts, and trigger downstream workflows (Slack alerts, CRM syncs) that Clay can't.
PATTERN
Batch Size Rules of Thumb
Batch sizes depend on your enrichment complexity: Simple enrichment (1-2 providers, no AI): 5,000 rows per batch. Standard enrichment (waterfall + scoring): 1,000-2,000 rows per batch. Heavy enrichment (Claygent + HTTP columns + waterfall): 500 rows per batch. Enterprise accounts (deep enrichment + multiple personas): 200-500 rows per batch.
Smaller batches = more control, easier debugging, less credit waste on errors. Larger batches = faster throughput but higher risk if something breaks. Start small, increase once you trust the flow.
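The rules of thumb above reduce to a small lookup. The tier names are illustrative labels for this sketch, not Clay terminology, and the "standard" and "enterprise" values are midpoints of the ranges given:

```python
def recommended_batch_size(tier: str) -> int:
    """Starting batch size per enrichment complexity tier.
    Start here, then increase once you trust the flow."""
    sizes = {
        "simple": 5000,     # 1-2 providers, no AI
        "standard": 1500,   # waterfall + scoring (midpoint of 1,000-2,000)
        "heavy": 500,       # Claygent + HTTP columns + waterfall
        "enterprise": 300,  # deep enrichment + personas (midpoint of 200-500)
    }
    return sizes[tier]
```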
ANTI-PATTERN
Anti-Pattern: Everything in One Table
Don't try to build your entire TAM in a single Clay table. I've seen people with 20,000-row tables that take 30 seconds to load, where enrichment columns time out because they're fighting for resources with 40 other columns. Split by phase: sourcing table → enrichment table → scoring table → output table. Or split by batch: batch 1 table, batch 2 table. Each table should be focused and fast. If your table has more than 5,000 rows or more than 30 columns, it's time to split.