Archives: Big Data

The Business Case For Apache Beam

You’ve just learned about a new streaming data processing technology that would solve many of the technical challenges you are experiencing within your organization today. Unfortunately, it would require significant time and budget to integrate and operationalize within your current solution. Enter Apache Beam. According to the main website, “Apache Beam provides an advanced unified programming …


Demo Driven Development: PySpark

In a previous blog post, I talked about demo-driven development and focusing on demonstrating business value when pursuing development efforts with new technology. If you are an organization that is focused on service-based delivery, you may find yourself having to demonstrate your capabilities to deliver a potential solution to a problem that hasn’t …


Demo Driven Development: Apache Spark

Technology alone does not solve problems. Back in 2014 at Techonomy, Jack Dorsey, the cofounder of Twitter and Square, put it very well: “To me, technology fundamentally is just a tool. It’s up to us to figure out how to use those tools and how to apply those tools”. In theory, this view of technology …


An Entry Into Big Data With Spark: Databricks Community Edition

During a local Spark meetup, I was introduced to the Databricks Community Edition. Within the Spark community, Databricks is well known, so I was excited when I got my early invite to try out the Community Edition. This brief article will be a mix of overview, step-by-step instructions, and opinion. However, by the end, you’ll have …