DATABRICKS CENTER OF EXCELLENCE
Hadoop-to-Databricks Migration QuickStart
- Accelerate high-value analytics initiatives and gain experience with Databricks.
- Get up and running on Databricks in less than two weeks.
- Complete one (1) Hadoop workload migration and analyze with SQL analytics and BI tools.
- Blueprint accelerators speed time to value and actionable insights.
2 weeks (10 days)
Pricing starting at:
a data team to analytics on Databricks
Databricks & data services, non-prod env
1 Hadoop workload migration
with SQL Analytics and BI tools
Major U.S. oil drilling company
- Siloed data in disparate systems and a Hadoop environment that was not highly performant
- Drill bit sensors collected data every second, but it reached data scientists only once every 24 hours (too slow for BI to inform decisions)
- Delays in data availability slowed response times and drilling adjustments, making operations inefficient
- Stream siloed data (rig, sensor, oil sample, financial, HR, marketing) into Azure Databricks Data Lake
- Process, normalize, and organize data into tables
- Remove the need for third-party legacy engineering tools to structure raw data
- Data modeling via Azure SQL & Azure Data Factory
- Share data to Power BI for real-time analytics, dashboards, and reports
Data migrated to Data Lake
Increase in rig state data processing (from every 24 hours to every 4 hours)
Increase in speed of OFT data processing (45 days of OFT data processed in 1 hour instead of 24 hours)
Real-time drilling data available in 1 second!
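
The improvement factors implied by the intervals quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `speedup` is an assumption introduced here, and the inputs are just the before/after intervals stated in this case study.

```python
def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times shorter the new interval is than the old one."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("intervals must be positive")
    return before_hours / after_hours


# Rig state data: processed every 24 hours before, every 4 hours after.
rig_state_factor = speedup(24, 4)

# OFT data: a 45-day batch took 24 hours before, 1 hour after.
oft_factor = speedup(24, 1)

print(f"Rig state processing: {rig_state_factor:.0f}x more frequent")
print(f"OFT batch processing: {oft_factor:.0f}x faster")
```

Under these figures, rig state data is refreshed 6 times as often and the OFT batch completes 24 times faster.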
- Data acquisition
- Simple data transformations
- Organizing data
- BI, reporting
- Optimizing costs & management
Up & running
- Implementation → Powered by Infra-as-Code
- Security Config → Blueprint security rapid config
- Lakehouse optimization → Blueprint Lakehouse Optimizer
- Identify data sources
- Historical & current data
- Data transformations
- Data quality
- Data set creation, scheduled
- Tables in the Lakehouse, ready!
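
The ingestion steps above (identify sources, transform, check quality, produce a table-ready data set) can be sketched in plain Python. This is a minimal illustration, not the QuickStart's actual pipeline: the record fields (`rig_id`, `ts`, `rpm`), the helper names, and the quality rule are all hypothetical, and a real deployment would typically run on Spark in Databricks rather than on in-memory lists.

```python
from datetime import datetime, timezone

# Hypothetical raw sensor records, as they might arrive from a source system.
RAW = [
    {"rig_id": " R1 ", "ts": "2024-01-01T00:00:00+00:00", "rpm": "120.5"},
    {"rig_id": "R2", "ts": "2024-01-01T00:00:01+00:00", "rpm": ""},  # missing value
    {"rig_id": "R1", "ts": "not-a-timestamp", "rpm": "118.0"},       # bad timestamp
]


def transform(record: dict) -> dict:
    """Normalize whitespace and types; raises ValueError on malformed input."""
    return {
        "rig_id": record["rig_id"].strip(),
        "ts": datetime.fromisoformat(record["ts"]).astimezone(timezone.utc),
        "rpm": float(record["rpm"]),
    }


def run_pipeline(raw: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split input into clean rows (table-ready) and rejected rows (for review)."""
    clean, rejected = [], []
    for record in raw:
        try:
            clean.append(transform(record))
        except (KeyError, ValueError):
            # Data quality rule (assumed): any unparseable field rejects the row.
            rejected.append(record)
    return clean, rejected


clean_rows, rejected_rows = run_pipeline(RAW)
print(len(clean_rows), "clean;", len(rejected_rows), "rejected")
```

Keeping rejected rows separate, rather than dropping them silently, makes data quality visible. Scheduling the run is left to whatever job scheduler the environment provides.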
BI & analysis, optimize, & roadmap
- Power BI or Tableau
- DBSQL for ad hoc analysis
- Dashboards & reports
- Utilization management
- Roadmap the future
AWS or Azure readiness
QuickStart data sources
Sample data & notebooks
Databricks is LIVE!
Build a net-new use case and validate the cost versus performance and usability of the Databricks platform.
- TCO and performance projection report
- Established Databricks Lakehouse environment
- Data ingestion pipeline
- Platform utilization monitoring app
- Working end-to-end use case/business process deployed to a non-production environment
- Results report and demonstration