
How to Design a Batch Processing. Understand batch processing from… | by Xiaoxu Gao | Jan, 2024


Understand batch processing from business and technical perspectives

Photo by Dannie Sorum on Unsplash

We live in a world where every human interaction becomes an event in a system, whether it’s purchasing clothes online or in-store, scrolling social media, or taking an Uber. Unsurprisingly, all these events are processed in one way or another. Some events expect a quick response, so they are processed immediately. For instance, when you complete a ride with Uber, you receive the receipt within a few seconds. The input and output are usually 1-to-1.

2 different data processing modes

Other events create greater value when processed collectively in the background. An example is generating a monthly report, where you need to combine all the transactions of that month. The input and output are usually many-to-1. This is called batch processing.

As data practitioners, we deal with batches every day. Batch processing is an old-school but still very powerful data processing method that every data person should know. Because it’s such a fundamental area, there is much to explore. In this article, I will start with the use cases of batch processing (how businesses can benefit from it), followed by its technical aspects. By the end of the article, you should have an idea of how to work with batches effectively in your environment.

What is batch processing and why?

As the intro showed, batch processing handles a group of events (a batch) in one job, whereas transaction processing handles one event at a time. Events in a batch usually share the same attributes and belong to the same business context.

An example batch (created by author)
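To make the contrast concrete, here is a minimal Python sketch of the two modes. It is an illustration rather than anything from a real system: the `Transaction` fields, the `issue_receipt` and `monthly_report` functions, and the sample data are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # Hypothetical event shape: every event in the batch shares these attributes.
    customer_id: str
    amount: float
    occurred_on: date


def issue_receipt(event: Transaction) -> str:
    """Transaction processing: one event in, one output out (1-to-1)."""
    return f"Receipt for {event.customer_id}: {event.amount:.2f}"


def monthly_report(batch: list[Transaction]) -> dict[str, float]:
    """Batch processing: many events in, one report out (many-to-1)."""
    totals: dict[str, float] = defaultdict(float)
    for event in batch:
        totals[event.customer_id] += event.amount
    return dict(totals)


# A made-up batch: events with the same attributes and the same business context.
batch = [
    Transaction("alice", 12.50, date(2024, 1, 3)),
    Transaction("bob", 8.00, date(2024, 1, 9)),
    Transaction("alice", 30.00, date(2024, 1, 21)),
]

receipts = [issue_receipt(event) for event in batch]  # one output per event
report = monthly_report(batch)                        # one output per batch
```

The receipt function can run the moment an event arrives, while the report only makes sense once all the events for the period have been collected.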

In most cases, we choose batch processing for two reasons.

Business

Certain outputs can only be generated once a complete series of records is available. Examples are end-of-month report generation, payroll processing, billing, and invoicing systems. The…
