Discussion about this post

User's avatar
Neural Foundry's avatar

Fantastic writeup on tackling throughput bottlenecks at petabyte scale. The batch size lever was interesting especially for backfills where 1TB batches basically eliminated downstream OPTIMIZE work. I've seen simlar patterns where eager clustering trades write amplification during ingestion for faster read paths, but it only works when batches are large enough. Predictive Optimization with fallback manual runs is smart.

No posts

Ready for more?