Live Streaming Data in Need of Data Wrangling & Advanced Analytics
Regional Dates:
About this Event
In this exciting week-long hackathon hosted by Blueprint and Databricks, choose your own adventure by selecting one of three challenges that will push your skills with streaming data, large datasets, and, of course, Databricks!
Analyzing streaming data is becoming the default mode of data management, with a growing number of use cases across industries. Making decisions in real time, identifying anomalous behaviors, and alerting on operational and life-safety issues all depend largely on data streams. Understanding these data patterns, and the skills and tools needed to bring these use cases online, is increasingly important.
Many companies are planning significant upgrades to their data warehouses and data marts. Databricks has a prominent place in cloud-native, modern data architectures. By participating in this event, you will gain an understanding of the patterns and tools that support data engineering, data management, and advanced analytics relevant to streaming data.
Your week-long task is to build a working prototype for one of the three challenges below. What you choose to do is up to you! The hackathon will conclude with teams presenting their ideas to a group of judges, followed by an award ceremony.
A team of our experts will be available to bounce ideas off and to offer some moral support. The virtual event will be hosted on Microsoft Teams, allowing you to collaborate seamlessly with your team and connect with mentors when needed.
The Scenario
Blueprint has created a simulator that streams transactions from a fictitious outdoor-gear retail company to an event hub. The transactions contain enough information about a purchase to support several data-management, machine learning, and advanced BI use cases. While the data stream in this hackathon simulates a POS system, the challenges and solutions can be applied to any industry that deals with large amounts of streaming data (IoT sensors, video game telemetry, etc.).
Challenge 1
Teams will acquire streaming data from an Azure event hub using Databricks. The hack-teams can explore a wide range of data-management techniques, with the goal of creating consumer-ready datasets with Databricks Delta.
Challenge 2
Blueprint has purposefully created a series of distributions in the streaming data to simulate real-life examples of fraud or anomalous transaction occurrences. From this data, the hack-teams will explore various machine-learning techniques to find and alert on fraudulent activity and anomalies. In addition, the hack-teams can opt to use other machine-learning algorithms to forecast sales, predict inventory stock-outs, reveal clusters, and create recommender engines.
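As one possible starting point for the anomaly part of this challenge, here is a minimal sketch of flagging unusual transaction amounts with a modified z-score (based on the median and the median absolute deviation). The sample amounts, the function name `flag_anomalies`, and the 3.5 threshold are illustrative assumptions; none of them come from the Blueprint simulator, and teams are free to use entirely different techniques.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not distort the baseline.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # All values (or most of them) are identical; nothing to score against.
        return []
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical purchase amounts; the 950.0 entry stands in for a suspicious sale.
amounts = [42.0, 39.5, 41.2, 40.8, 38.9, 43.1, 950.0, 40.1]
flagged = flag_anomalies(amounts)
print(flagged)
```

In a real solution the same idea would more likely run per store or per product category over the streamed data in Databricks, rather than over a small in-memory list.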
Challenge 3
Databricks’ Delta Lake has unleashed a new data pattern in the cloud. The so-called Lakehouse is gaining popularity because data can be stored, updated, and deleted directly on the data lake, without the need to process it into a warehouse. For many use cases, this represents the quickest path to value. As Delta Lake provides direct links to enterprise-scale BI tools, like Microsoft’s Power BI, the hack-teams will explore advanced methods of building self-service BI data models and dashboards over 100+ million rows of data.
Space is limited to 30 participants, in teams of three or fewer, with a maximum of 10 teams.
Sign up before space runs out. Teams are selected on a first-come, first-served basis.
Requirements
You must use Databricks in your solution. The final project must be submitted as a demo video through Blueprint’s Hackathon Microsoft Teams channel. For networking purposes, participants are strongly encouraged to register for the region they live in.
Team Formation
All participants must register through this Eventbrite invite. You will then receive an invite to a Microsoft Teams site to access the Hackathon details. There you will be able to register your team, choose a team name and captain, and receive an assigned mentor from Databricks or Blueprint Technologies.
Prizes
1st place: $500 split among the winning team’s members + mentions on Databricks’ and Blueprint’s social media channels.
People’s Choice: Mentions on Databricks’ and Blueprint’s social media channels + a virtual high five from the judges.