Blueprint Technologies - Data information specialists

Machine learning for time-series data from a Bayesian perspective

By Kyung Kim

I used to be a faculty member in the Department of Bioengineering at the University of Washington in Seattle. My lab’s research was motivated by the question of how to engineer microbes that are safe, by preventing unwanted mutations and by making the behavior of newly implemented cellular functions predictable. To answer this question, I took diverse mathematical and statistical approaches. One of the more recent approaches I used is Bayesian inference, which I applied to identify and characterize mathematical models that explain observed noisy time-series cellular signals.

Now at Blueprint Consulting Services, I am extending this work toward industrial applications, using various mathematical and statistical approaches to address diverse data science problems in sales and marketing, intelligent crude oil pump systems, advertisement bidding, and cyber-security, as well as specific fields within the biotech industry.

Time-Series Data with Limited Sample Size

Time series data are often found in diverse areas of industry, and can help answer the following questions, to name a few:

  • “How likely is it that individual customers will re-subscribe to my online services?”
  • “What are the future sales of my products? What are the probabilities?”
  • “When are my devices going to fail to operate? What are their chances of failure in a certain timeframe?”

These questions are often answered by understanding the relationship between data at multiple time points and by taking into account seasonality and trends. This is often done by applying autoregressive integrated moving average (ARIMA) models. However, when the time series data set is small or shows significant variability, other, more suitable analysis methods may be needed. One such alternative is Bayesian inference.

Bayesian Inference: Approximate Bayesian Computation

Here, I introduce a Bayesian approach that is not yet well known in the data science industry: Approximate Bayesian Computation (ABC). The approach consists of three steps: (1) sampling parameter values of a mathematical model from a prior distribution that encodes your prior knowledge of the system; (2) generating synthetic data by simulating the model with those values and comparing the synthetic data with the observed data; and (3) keeping the parameter values whose simulations match the observed data most closely. The ABC method thus yields an approximate posterior probability distribution over the parameter values of a mathematical model that explains a given dataset.
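The three steps can be sketched as a simple rejection sampler. This is a minimal illustration and not a production implementation: the linear-growth sales model, the uniform prior on [0, 10], the root-mean-square distance, and all function names (`simulate_sales`, `distance`, `abc_rejection`) are assumptions made for the example, and the "observed" data are synthetic.

```python
import random
import statistics

def simulate_sales(growth, n_weeks, rng, baseline=100.0, noise_sd=5.0):
    """Hypothetical model (an assumption): linear weekly growth plus Gaussian noise."""
    return [baseline + growth * t + rng.gauss(0.0, noise_sd) for t in range(n_weeks)]

def distance(sim, obs):
    """Root-mean-square difference between two equal-length series."""
    return (sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)) ** 0.5

def abc_rejection(observed, n_draws, n_keep, rng):
    # Step 1: sample candidate parameter values from the prior (uniform on [0, 10]).
    candidates = [rng.uniform(0.0, 10.0) for _ in range(n_draws)]
    # Step 2: simulate synthetic data for each candidate and compare with the observations.
    scored = [(distance(simulate_sales(g, len(observed), rng), observed), g)
              for g in candidates]
    # Step 3: keep the candidates whose simulations are closest to the observed data.
    scored.sort()
    return [g for _, g in scored[:n_keep]]

rng = random.Random(0)
# Synthetic stand-in for real observations, generated with a known growth of 3.0.
observed = simulate_sales(3.0, 12, rng)
posterior = abc_rejection(observed, n_draws=5000, n_keep=100, rng=rng)
posterior_mean = statistics.mean(posterior)
print(f"approximate posterior mean of growth: {posterior_mean:.2f}")
```

The retained values form an approximate posterior sample. Keeping fewer candidates tightens the match to the data at the cost of more simulations per accepted value.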

This approach can be used for various classes of problems, whether the observed data are (nearly) continuous or discrete, and whether they are smooth (deterministic) or noisy. More importantly, because the approach is Bayesian, mathematical models can be trained and improved even with small data sets. The approach does not provide one definitive prediction, but rather a range of predictions with their corresponding probabilities. Such probability-based predictions indicate how confident we are in a forecast, and therefore how confidently a prescribed action can be taken toward your desired goals.
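As a rough illustration of a range of predictions with probabilities, the sketch below turns a set of posterior parameter samples into a predictive interval and an exceedance probability. The posterior samples here are a Gaussian stand-in rather than the output of a real ABC run, and the model, the week index, and the threshold of 140 units are all hypothetical.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical posterior samples of a weekly growth parameter; in practice these
# would come from an ABC run rather than from a Gaussian stand-in.
posterior_growth = [rng.gauss(3.0, 0.3) for _ in range(2000)]

def predict_sales(growth, week, rng, baseline=100.0, noise_sd=5.0):
    """Hypothetical model (an assumption): linear growth plus Gaussian noise."""
    return baseline + growth * week + rng.gauss(0.0, noise_sd)

# One simulated forecast per posterior sample gives a predictive distribution.
week = 12
forecasts = [predict_sales(g, week, rng) for g in posterior_growth]

# statistics.quantiles with n=20 returns 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(forecasts, n=20)
low, median, high = cuts[0], cuts[9], cuts[18]

# Probability that sales exceed a hypothetical stocking threshold.
threshold = 140.0
prob_above = sum(f > threshold for f in forecasts) / len(forecasts)

print(f"90% interval: {low:.1f} to {high:.1f}, median {median:.1f}")
print(f"P(sales > {threshold}) = {prob_above:.2f}")
```

The interval and the exceedance probability are both read directly off the simulated forecasts, so no closed-form predictive distribution is needed.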

Forecasting Sales in Supply Chain Management

Let us consider the problem of forecasting sales for wireless phone service providers. In this business, it is important to forecast the demand for phones and to supply them to individual stores on time. If sales can be forecasted accurately, the demand for new phones can be reliably predicted and phones can be supplied in a timely manner.

Sales can be affected by a number of factors, which can fluctuate randomly over time. Forecasting sales is therefore inherently a matter of odds, that is, of probability, which makes Bayesian approaches a natural fit.

To forecast sales based on historical sales data, mathematical models can be proposed that incorporate various factors such as the promotions of the company and its competitors, weekly or monthly visits to its local stores, weekly or monthly sales, and local demographic information. Once the models are built, their parameter values can be inferred by starting from a best guess (more specifically, a prior probability distribution over the values) and then selecting the parameter values that explain the observed data well. This Bayesian approach gives you the updated (posterior) probability distribution of the parameter values based on your observed data. The procedure can be repeated until the parameter distributions no longer change. The final selection of parameter values provides a collection of predicted sales values with their corresponding probabilities. Based on these probabilities, future sales can be forecasted within an interval at a given confidence level (a credible interval, in Bayesian terms).
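The repeated prior-to-posterior update described above can be sketched as a sequential ABC scheme in which each round proposes new values around the previous round's accepted values and tightens the tolerance. This is one possible way to organize the iteration, with importance weights in the style of population Monte Carlo ABC; it is not necessarily the procedure the author used. The model, the uniform prior, the summary statistic (mean sales level), the tolerance schedule, and all names are assumptions, and the fixed schedule stands in for a real stopping rule.

```python
import math
import random
import statistics

PRIOR_LOW, PRIOR_HIGH = 0.0, 10.0  # hypothetical uniform prior on weekly growth

def simulate_sales(growth, n_weeks, rng, baseline=100.0, noise_sd=5.0):
    """Hypothetical model (an assumption): linear weekly growth plus Gaussian noise."""
    return [baseline + growth * t + rng.gauss(0.0, noise_sd) for t in range(n_weeks)]

def distance(sim, obs):
    """Compare two series through one summary statistic, the mean sales level."""
    return abs(statistics.mean(sim) - statistics.mean(obs))

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def abc_round(observed, n_particles, tolerance, rng, previous=None):
    """One round; `previous` is (particles, weights, kernel_sd) from the last round."""
    particles, weights = [], []
    while len(particles) < n_particles:
        if previous is None:
            theta = rng.uniform(PRIOR_LOW, PRIOR_HIGH)  # first round: draw from the prior
        else:
            old, old_w, kernel_sd = previous
            # Pick a previous particle by weight, then perturb it with Gaussian noise.
            theta = rng.gauss(rng.choices(old, weights=old_w)[0], kernel_sd)
            if not PRIOR_LOW <= theta <= PRIOR_HIGH:
                continue  # zero prior density, so reject
        if distance(simulate_sales(theta, len(observed), rng), observed) > tolerance:
            continue
        if previous is None:
            w = 1.0
        else:
            # Importance weight: prior density over the proposal mixture density.
            # The uniform prior density is constant and cancels after normalization.
            w = 1.0 / sum(wj * normal_pdf(theta, tj, kernel_sd)
                          for tj, wj in zip(old, old_w))
        particles.append(theta)
        weights.append(w)
    total = sum(weights)
    return particles, [w / total for w in weights]

rng = random.Random(2)
# Synthetic stand-in for real observations, generated with a known growth of 3.0.
observed = simulate_sales(3.0, 12, rng)

previous = None
history = []
for tolerance in (5.0, 2.0, 1.0):  # hypothetical shrinking tolerance schedule
    particles, weights = abc_round(observed, 200, tolerance, rng, previous)
    mean = sum(p * w for p, w in zip(particles, weights))
    var = sum(w * (p - mean) ** 2 for p, w in zip(particles, weights))
    history.append(mean)
    # Kernel width: square root of twice the weighted variance of the current particles.
    previous = (particles, weights, math.sqrt(2.0 * var))

print("weighted posterior mean per round:", [round(m, 2) for m in history])
```

In practice the loop could stop once the weighted mean and variance change by less than a chosen amount between rounds, which corresponds to the parameter distributions no longer changing.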

Decision-tree-based approaches such as Random Forest, XGBoost, and AdaBoost can be used for sales forecasting as well. These approaches can provide high predictive power while limiting overfitting. However, the prescriptive actions that can be derived from them are not intuitive, simply because these approaches build black-box models rather than mechanistic ones. The ABC method, on the other hand, may face challenges in coming up with reasonable mathematical models. Once appropriate models are proposed, however, you can systematically (in the Bayesian way) infer the parameter values and prescribe action plans based on the mechanisms built into the models.

Summary

Various machine learning algorithms have been developed to meet the prediction needs of different systems. Bayesian methods such as ABC have been developed to bypass the construction of likelihood functions and to be applicable to various classes of systems, including nonlinear stochastic dynamical systems. This approach can provide a mechanistic understanding of time series data and even prescriptive action plans for the future, once appropriate mathematical models are provided based on the underlying mechanisms. It has been widely applied in medical, biological, and bioengineering fields, including systems and synthetic biology, population genetics, ecology, epidemiology, and oncology. I believe that the ABC approach will come to be appreciated in other fields of data science research and industry as well.

© 2022 Blueprint Technologies, LLC. 2600 116th Avenue Northeast, First Floor
Bellevue, WA 98004

All rights reserved.