Unbundling of the Cloud Data Warehouse: Open Source Databases and Data Lakes

Proprietary cloud data warehouses such as Snowflake, BigQuery, and Redshift gave data teams scalability and convenience, but at the cost of performance bottlenecks, soaring expenses, and vendor lock-in. As workloads expanded to include user-facing analytics, observability, and machine learning, the limitations of these monolithic architectures became apparent. This talk examines the unbundling of the cloud data warehouse and the emergence of a modern data stack driven by open source technologies. We will discuss how databases like Postgres and ClickHouse, combined with open data lake table formats such as Iceberg, Delta Lake, and Hudi, enable flexible, cost-effective, and high-performance solutions. By replacing monolithic warehouses with composable, open architectures, organizations can build systems optimized for their specific real-time analytics and data processing needs.