Due to the recent outbreak of COVID-19, O’Reilly Strata Data & AI 2020 in San Jose, which was scheduled for March 15–18, will be merged with Strata NY in September.
If you’ve invested in Spark infrastructure, rely heavily on SparkSQL for a variety of workloads, and are ready to take on a cutting-edge SQL engine that runs on GPUs for at least an order-of-magnitude increase in performance with minimal effort, you’re about to find out how. Blueprint’s Director of Engineering, Claudiu Barbura, explains how to shield your existing consumer applications built on Spark (reporting from Tableau and PowerBI, data science workloads in notebooks, etc.) from any replacement or enhancement of the engine under the hood.
You’ll discover the lessons Blueprint learned with Spark (CPU), BlazingSQL and Rapids.ai (GPU), and Apache Arrow in its quest to dramatically increase the performance of its data virtualizer, which enables real-time access to disparate data sources across different cloud providers, on-premises databases, and APIs when native query translation to the data source isn’t possible. You’ll also learn how to tap the performance of this GPU-based SQL engine (BlazingSQL) from your favorite tools via a unified interface, especially if you’re a BI analyst or data scientist.