The business case for Apache Beam

By Gary Nakanelua

You’ve just learned about a new streaming data processing technology that would solve many of the technical challenges you are experiencing within your organization today. Unfortunately, it would require significant time and budget to integrate and operationalize within your current solution.

Enter Apache Beam.

According to the Apache Beam website, “Apache Beam provides an advanced unified programming model, allowing you to implement batch and streaming data processing jobs that can run on any execution engine.” Think of it as a general contractor: specialized subcontractors perform the work, yet you only ever interact with the general contractor. If you need a new roof on your home because a previous subcontractor did a sub-par job, you still deal only with the general contractor. They don’t rebuild the entire house; they simply hire a new subcontractor to put on a new roof.
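To make the analogy concrete, here is a toy sketch in plain Python. It is not the Beam API: the names `word_lengths_pipeline`, `EagerRunner` and `LazyRunner` are hypothetical and exist only to illustrate the idea of describing a job once and handing it to interchangeable execution engines.

```python
from typing import Callable, Iterable, List

# A transform takes a collection of records and returns a new collection.
Transform = Callable[[Iterable], Iterable]


def word_lengths_pipeline() -> List[Transform]:
    """The job definition: what to compute, with no mention of how it runs."""
    return [
        lambda lines: (word for line in lines for word in line.split()),
        lambda words: (len(word) for word in words),
    ]


class EagerRunner:
    """Hypothetical 'subcontractor' that materializes every step as a list."""

    def run(self, steps: List[Transform], data: Iterable) -> list:
        for step in steps:
            data = list(step(data))
        return list(data)


class LazyRunner:
    """Hypothetical 'subcontractor' that streams records through generators."""

    def run(self, steps: List[Transform], data: Iterable) -> list:
        stream = iter(data)
        for step in steps:
            stream = step(stream)
        return list(stream)


lines = ["unified model", "any engine"]
steps = word_lengths_pipeline()

# The same job definition is handed to two different execution strategies.
eager = EagerRunner().run(steps, lines)
lazy = LazyRunner().run(steps, lines)
```

Both runners produce the same result from the same definition. Swapping one for the other touches a single line, which is the property the contractor analogy is pointing at.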

Dealing with “Out of Scope”

Today’s agile sprint teams are driven by their solution backlog. This backlog is filled with bugs, feature requests and spikes written to address needs that should be delivered by the current solution. Yet how often does a feature get requested, only to have the technical team dismiss it as “out of scope”? They note that the original specification document didn’t mention the need for stateful computations, event-time windowing or some other fancy set of words used to describe the technical approach to your request. “If only you had made it part of the original requirements,” they say, “then we could have accounted for it in our architecture and approach.”

So another project team is started, one tasked with creating the “v-next” version of the original solution that will include all current functionality plus the newly requested features. It will be leaner, meaner and built on the latest technology so as to avoid the mistakes of the past. “It will scale with all your needs,” the super-motivated project team touts. Product backlogs are created. Releases are made. The world rejoices until an “out of scope” feature is requested. Then the cycle repeats itself. As a decision maker, how do you break this cycle?

Enter Apache Beam.

Beam gives you a unified, portable and extensible foundation from which to answer your top-level streaming architecture decisions. I’ve had the pleasure of meeting and talking with Andrew Psaltis, author of “Streaming Data: Understanding the real-time pipeline,” on several occasions. In his Apache Beam presentation at QCon in 2016, he noted:

“You can switch to whatever is more performant, more scalable, maybe something that requires a smaller footprint. Whatever your requirements are, it becomes easy to switch.”

You can view his presentation in its entirety at https://www.infoq.com/presentations/apache-beam.

Encouraging The “New Hotness”

Engineers and developers love working with new frameworks, libraries and APIs. Whether it’s for performance, ease of development, speed of deployment or just intellectual curiosity, the desire to utilize < insert new technology here /> will always be a topic of conversation within technical teams.

Consider stream processing computation engines. In the last six years, we’ve seen Storm, Spark, Flink and Apex grow in popularity (to name a few). Each is, or was, the “new hotness,” and all promise scalable, performant and fault-tolerant solutions to today’s streaming data problems. In practice, each has its pros and cons when used within a solution for any given organization. How do you enable a technical team to stay relevant, curious and motivated to experiment with the next big thing without draining your budget?

Enter Apache Beam.

Admittedly, my interest in Apache Beam grew from a conversation I had with another engineer, Ryan Harris, at a local Apache Spark meetup. I’ve spent a lot of time with Spark and wanted to see what his excitement was all about.

I ran through the Python quick start at https://cloud.google.com/dataflow/docs/quickstarts/quickstart-python with a local runner. Next I gave it a go with Google’s Cloud Dataflow runner. Finally, I ran it using the Spark runner. Aside from a few local development environment configuration adjustments (those were my own fault), Apache Beam let me experiment with capabilities from a few different technologies quickly.
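The three runs described above can be sketched roughly as follows. This is an illustrative word count using the Beam Python SDK, not the quick start’s exact code. The function names `runner_flags` and `run_wordcount` are my own, and the project, bucket and Spark master values in the usage note are placeholders. Exact flag requirements depend on your SDK version and on how each runner is set up.

```python
import re


def runner_flags(runner, **settings):
    """Build command-line style flags that select and configure a runner."""
    flags = [f"--runner={runner}"]
    flags += [f"--{key}={value}" for key, value in settings.items()]
    return flags


def run_wordcount(input_path, output_prefix, flags):
    """Run the same word-count pipeline on whichever runner the flags name."""
    # Imported here so runner_flags stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(flags)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(input_path)
            | "Split" >> beam.FlatMap(lambda line: re.findall(r"[A-Za-z']+", line))
            | "PairWithOne" >> beam.Map(lambda word: (word, 1))
            | "Count" >> beam.CombinePerKey(sum)
            | "Format" >> beam.MapTuple(lambda word, count: f"{word}: {count}")
            | "Write" >> beam.io.WriteToText(output_prefix)
        )


# Only the flags change between runs; the pipeline code above does not.
local_flags = runner_flags("DirectRunner")
```

Under these assumptions, moving from the local run to the other two is a matter of passing different flags to `run_wordcount`, for example `runner_flags("DataflowRunner", project="my-project", region="us-central1", temp_location="gs://my-bucket/tmp")` or `runner_flags("SparkRunner", spark_master_url="spark://localhost:7077")`. Each runner still needs its own environment configured, which is where my local adjustments came in.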

You can check out the current Apache Beam capability matrix at https://beam.apache.org/documentation/runners/capability-matrix/. Don’t see the latest technology listed? Apache Beam is open source and has well-documented SDKs, so new runners can be created. Plus, Apache Beam is a core component of Google’s Cloud Dataflow service, so expect it to keep gaining new capabilities.

Conclusion

As a decision maker, you want the peace of mind that a technical solution can scale with future business needs and enable innovation within your organization through technology experimentation. Apache Beam is a worthwhile addition to a streaming data architecture to give you that peace of mind.

© 2022 Blueprint Technologies, LLC. 2600 116th Avenue Northeast, First Floor
Bellevue, WA 98004

All rights reserved.
