Blueprint Technologies - Data information specialists

Stop overprocessing data

By Eric Vogelpohl

“We’re starving for data!”

This is something I regularly hear as I meet with stakeholders from companies large and small. Perhaps the most painful message came from the Director of Marketing of a large restaurant chain. We were on our second conference call when she said, with a sigh of frustration I’ll never forget, “Does this mean my team will be able to get current item data to analyze without having to request it from IT each time?”

My professional passion is being a data evangelist for people who want to make business more efficient. In today’s fast-paced world, hearing comments like that is a kick in the gut for me.

I’ve been in the data business for years. I have watched architectural patterns evolve and morph from enterprise data warehousing to operational data stores/integration hubs to logical data marts/data virtualization to big data stores and finally to ETL, ELT and reverse ETL.

Undoubtedly, these have been exciting times in the data business because each of those five data management patterns has merits in the right context. However, none of them accounts for three things, and that gap often forces BI and data leaders to extend their data pipelines through successive data stores, overprocessing data and increasing the time needed to get data to consumers.

1 - Modern platforms are different

The world has seen a rapid rise in Software as a Service (SaaS) platforms, such as Dynamics, Workday and Salesforce, all of which have far simpler data schemas than the complex line-of-business and custom applications of old. Older systems stored data within a database optimized for efficiently writing data. Reading data, on the other hand, required numerous steps: extracting data from tables, performing joins, flattening the results and converting oddly named columns into something humans could consume.
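To make that read path concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The tables (`ORDHDR`, `ORDLN`, `ITMMST`), their column names and the sample rows are all hypothetical, invented only to mimic a write-optimized legacy schema; they are not taken from any real system.

```python
import sqlite3

# Hypothetical legacy schema: normalized for fast writes, with terse column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDHDR (ORD_NO INTEGER PRIMARY KEY, CUST_ID INTEGER, ORD_DT TEXT);
    CREATE TABLE ORDLN  (ORD_NO INTEGER, ITM_CD TEXT, QTY INTEGER);
    CREATE TABLE ITMMST (ITM_CD TEXT PRIMARY KEY, ITM_DSC TEXT, UNT_PRC REAL);

    INSERT INTO ORDHDR VALUES (1, 100, '2022-03-01'), (2, 101, '2022-03-02');
    INSERT INTO ORDLN  VALUES (1, 'BRG', 2), (1, 'FRY', 1), (2, 'BRG', 1);
    INSERT INTO ITMMST VALUES ('BRG', 'Burger', 8.5), ('FRY', 'Fries', 3.0);
""")

# The read path: join three tables, flatten to one row per order line,
# and rename columns into something an analyst can read.
FLATTEN_SQL = """
    SELECT h.ORD_NO  AS order_id,
           h.ORD_DT  AS order_date,
           i.ITM_DSC AS item_name,
           l.QTY     AS quantity,
           l.QTY * i.UNT_PRC AS line_total
    FROM ORDHDR h
    JOIN ORDLN  l ON l.ORD_NO = h.ORD_NO
    JOIN ITMMST i ON i.ITM_CD = l.ITM_CD
    ORDER BY h.ORD_NO, i.ITM_DSC
"""

def flatten_orders(connection):
    """Return analyst-friendly rows as a list of dicts."""
    cursor = connection.execute(FLATTEN_SQL)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

rows = flatten_orders(conn)
for row in rows:
    print(row)
```

Every step in `FLATTEN_SQL` is work someone had to build and maintain before an analyst could ask a question. A source that already exposes readable, flat records removes most of that work.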

Modern platforms require little processing to support analytics and data discovery. Indeed, many digitally native platforms have out-of-the-box integrations with leading cloud data warehouse vendors ready to deliver data directly to users.

Blueprint Technologies advocates for a fast-lane approach to pipe data from these services into a cloud-based data warehouse. The speed at which these data are delivered to analysts correlates directly to value. A data architecture should support as direct a path as possible for an organization.

2 - Modern data warehouses can scale up and down

Cloud-based data warehouses allow for utility-based scaling

In the past, the performance of a data warehouse depended on how much a company spent on compute nodes. Data warehouse and BI managers had to request and allocate capital to purchase additional compute and storage resources – often over yearly budgeting cycles. This meant there was an intense focus on curtailing query loads. One strategy was to – yet again – process data from the warehouse into data marts. These data marts held less data and were optimized for ad hoc queries.

While this was a way to protect the warehouse from uninformed users doing baseline analysis that could become costly, the data sets eventually provided to users were so overprocessed, and often covered such a short timeframe, that they left little room for exploring data in any innovative way.

For cloud-based data warehouses, however, users exploring data in ad hoc or unprescribed ways have far less of an impact on the system. Cloud data warehouses can scale dynamically using various techniques. There is often no need to create these secondary data marts because the compute behind a data warehouse can auto-scale up and down with the query load.
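The cost of the data mart approach can be sketched in a few lines. The example below uses an in-memory SQLite database purely as a stand-in for a warehouse (it does not model auto-scaling), and the `sales` table, the `mart_sales_march_2022` mart and all values are hypothetical. It shows an ad hoc question that the full-grain table can answer and the pre-aggregated, one-month mart cannot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the warehouse: full-grain sales history (hypothetical table).
    CREATE TABLE sales (sale_date TEXT, store TEXT, item TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('2021-03-05', 'North', 'Burger', 100.0),
        ('2021-03-06', 'South', 'Fries',   40.0),
        ('2022-02-10', 'North', 'Fries',   55.0),
        ('2022-03-05', 'North', 'Burger', 130.0),
        ('2022-03-06', 'South', 'Fries',   45.0);

    -- Stand-in for a legacy data mart: one month only, already aggregated by store.
    CREATE TABLE mart_sales_march_2022 AS
        SELECT store, SUM(amount) AS total_amount
        FROM sales
        WHERE sale_date BETWEEN '2022-03-01' AND '2022-03-31'
        GROUP BY store;
""")

def march_burger_sales_by_year(connection):
    """Ad hoc question: how did March burger sales change year over year?"""
    query = """
        SELECT substr(sale_date, 1, 4) AS year, SUM(amount) AS total
        FROM sales
        WHERE item = 'Burger' AND substr(sale_date, 6, 2) = '03'
        GROUP BY year
        ORDER BY year
    """
    return dict(connection.execute(query).fetchall())

# The mart kept only store-level totals, so item and date are no longer available.
mart_columns = [row[1] for row in conn.execute("PRAGMA table_info(mart_sales_march_2022)")]
print(mart_columns)

# The full-grain table can still answer the question directly.
print(march_burger_sales_by_year(conn))
```

The mart is cheap to query, but the item and date detail was aggregated away when it was built. When compute scales with demand, running the same question against the full table is usually the simpler path.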

Blueprint believes the modern data estate has at its heart a cloud data warehouse. As data inform decisions (or should be informing decisions) at a much greater rate, Blueprint advises clients to seriously consider the reasons driving additional data processing in a data pipeline.

3 - End users are more data savvy than you think

Historically, a single major contributor drove the need to overprocess data into secondary data marts and data cubes. BI tools (Excel Pivot, Power BI and other enterprise reporting and analytical platforms) relied on data being in a specific format to support a click-and-drag user experience.

While that may be appropriate for a certain class of users, data managers have quickly found that employees today have the skills and the access to data exploration tools that often negate the need to create these tailored data marts. Ten to 15 years ago, data needed to be overprocessed because end users knew little more than fundamental report navigation: dumping data into Excel and making pivot tables. Now though, whether graduates earned a degree in finance, MIS or engineering, they often understand statistics, possess SQL skills and are quite savvy with modern data tools.

Blueprint is passionate about unleashing the power of data. Overprocessing data reduces the degrees of freedom these savvy analysts have in exploring insights and outliers for a business, and that is an ever-greater risk to improving operations. Carefully consider the risk/reward tradeoff of overprocessing and delaying access to data.

Aim important data pipelines & processing rituals at the right areas

Certainly, it is necessary to recognize that there are valid reasons to process data thoroughly. Formal financial statements, SEC filings and other scenarios that demand transactional auditing and data precision within the confines of a financial or planning data model are high on that list.

But when it comes to connecting data from many other business areas, companies should question the need to overprocess. When exploring trends about tickets from the field, it doesn’t materially change anything if the numbers are off slightly. Companies need to get over their fears of handing over raw data to their BI engineers and analysts.
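A small worked example shows why slight inaccuracies rarely change a trend. The ticket counts below are hypothetical, invented for illustration only, and the helper functions are a sketch rather than a recommended method for trend analysis.

```python
def percent_change(series):
    """Percent change from the first to the last value in a series."""
    first, last = series[0], series[-1]
    return (last - first) / first * 100

def trend_direction(series):
    """Classify a series as 'up', 'down' or 'flat' from its endpoints."""
    change = percent_change(series)
    if change > 0:
        return "up"
    if change < 0:
        return "down"
    return "flat"

# Hypothetical monthly ticket counts, invented for illustration only.
raw_counts = [123, 137, 155, 172]      # straight from the source, a few duplicates included
cleaned_counts = [120, 135, 150, 170]  # after a full deduplication pass

print(trend_direction(raw_counts), round(percent_change(raw_counts), 1))
print(trend_direction(cleaned_counts), round(percent_change(cleaned_counts), 1))
```

In this sketch both series point the same way and differ by only a couple of percentage points, so an analyst working from the raw counts would reach the same conclusion sooner.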

Let’s start a conversation about how Blueprint can help you establish a cloud-based data estate that prioritizes getting data to users quickly, enabling data-driven business decisions in a competitive landscape.

© 2022 Blueprint Technologies, LLC. 2600 116th Avenue Northeast, First Floor
Bellevue, WA 98004

All rights reserved.