In a talk at the recent Data Engineering Summit, Sudarshan Pakrashi, Director of Data Engineering at Zeotap, addressed the elephant in the room for many data-driven organizations: the rising cost of storing and processing ever-expanding data sets. His answer was simple but far-reaching, and it changes how we think about handling data at this volume.
Painting the Data Landscape
Pakrashi illustrated the scale of the challenge with a relatable example. Suppose an organization like Zeotap tracks a million users' impressions across 100,000 ads daily, with each user assigned a unique 64-bit hash key. The accumulated data would come to around 160 GB daily, ballooning to 50 terabytes monthly.
This scenario represented a single use case, with actual si
From Precision to Efficiency: A New Perspective on Data Engineering
- By 理想
At the recent Data Engineering Summit, Sudarshan Pakrashi argued for managing big data with probabilistic data structures, which give up some precision in exchange for cheaper processing and storage. The proposal adds a fresh perspective to the discussion of how large data sets are processed and stored.
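To make the idea of a probabilistic data structure concrete, here is a minimal sketch of one such structure, a k-minimum-values (KMV) estimator that approximates the number of distinct users who saw an ad while storing only a fixed number of hash values. The summary above does not say which structures the talk covered, so the choice of KMV, the class name `KMVSketch`, the parameter `k`, and the `user-…` identifiers are all assumptions made for illustration.

```python
import hashlib
import heapq


class KMVSketch:
    """Approximate distinct counter (illustrative, not from the talk).

    Keeps only the k smallest hash values seen so far, so memory stays
    fixed no matter how many items are added.
    """

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # negated hashes: a max-heap of the k smallest
        self._members = set()  # hashes currently in the heap, to skip repeats

    @staticmethod
    def _hash(item):
        # Map the item to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, item):
        h = self._hash(item)
        if h in self._members:
            return  # already tracked; a repeat impression changes nothing
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self._heap))
        kth_smallest = -self._heap[0]
        # If n uniform hashes are spread over [0, 1), the k-th smallest
        # sits near k / n, which gives the estimate (k - 1) / kth_smallest.
        return (self.k - 1) / kth_smallest
```

With `k=1024` the sketch holds at most 1,024 hash values regardless of audience size, and its typical relative error is on the order of 1/sqrt(k), roughly 3 percent. Calling `add` once per impression and `estimate` at reporting time is the intended usage; raising `k` buys accuracy at the cost of memory.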
