Apache Druid was invented to address the lack of a data store optimized for real-time analytics. Druid combines the best of real-time streaming analytics and multidimensional OLAP with the scale-out storage and computing principles of Hadoop to deliver ad hoc, search and time-based analytics against live data with sub-second end-to-end response times. Today, thousands of companies worldwide rely on Druid to provide real-time monitoring and analytics, including data-driven companies like Netflix, Lyft, Pinterest and Alibaba. The largest Druid deployments deliver sub-second analytics against millions of events per second and 100s of petabytes of data.

This whitepaper provides an introduction to Apache Druid, including its evolution, core architecture and features, and common use cases.