Combining strict order with massive parallelism using Kafka
Getting the best of both worlds
Having been involved in several large-scale Kafka projects for different clients across a broad range of industries, I have heard my fair share of questions on Apache Kafka — ranging from the fundamental to the esoteric. One question that never seems to go out of fashion is: How can you maintain strict order, yet still process records in parallel?
And it’s a fair question. Strict order assumes linearizability, the very notion of which seems to contradict the objectives of parallelism.

Partial and total order
We will start by exploring the notion of order.
As expected of an event-streaming platform, Kafka preserves the order of published records, provided those records occupy the same partition. To understand what this means in practice, one needs to explore the architecture of Kafka topics and their underlying sharding mechanism: partitions.
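The sharding idea can be sketched in a few lines: a record's key is hashed to select a partition, so all records sharing a key land in the same partition and retain their relative order. This is a simplified illustration, not Kafka's actual implementation (the real producer hashes the key bytes with murmur2); the function name and hash choice here are assumptions for the sketch.

```python
import hashlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index deterministically.

    Hypothetical sketch: uses MD5 for illustration, whereas the real
    Kafka producer uses murmur2 on the key bytes.
    """
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Records sharing a key always map to the same partition,
# so their relative order is preserved within that partition.
p1 = choose_partition(b"customer-42", 8)
p2 = choose_partition(b"customer-42", 8)
assert p1 == p2
```

Because the mapping is deterministic, keyed records for the same entity (here, a hypothetical customer ID) can never be split across partitions, which is exactly what makes per-key ordering possible.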
Note: Kafka uses the term ‘record’, where others might use ‘message’ or ‘event’.
A partition is a totally ordered sequence of records, and is fundamental to Kafka. A record has an ID (a 64-bit integer offset) and a millisecond-precise timestamp. It may also have a key and a value; both are byte arrays and both are optional.

The term ‘totally ordered’ means that for any two records in a partition, one definitively precedes the other. In practice, records from any given producer are written in the order they were emitted by the application: if record P was published before Q, then P will precede Q in the partition (assuming P and Q share a partition). Furthermore, they will be read in the same order by all consumers; P will always be read before Q, for every possible consumer.

This ordering guarantee is vital in the large majority of use cases: published records generally correspond to real-life events, and preserving the timeline of those events is often essential.
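The structure described above can be modelled as an append-only log that assigns monotonically increasing offsets. This is a toy model to make the total-order property concrete, not Kafka's storage implementation; the class and field names are assumptions for the sketch.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    offset: int                # position in the partition (64-bit int in Kafka)
    timestamp_ms: int          # millisecond-precise timestamp
    key: Optional[bytes]       # optional key
    value: Optional[bytes]     # optional value


class Partition:
    """Toy model of a partition: a totally ordered, append-only record log."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, key: Optional[bytes] = None,
               value: Optional[bytes] = None) -> int:
        # Offsets increase monotonically; writes preserve emission order.
        offset = len(self._records)
        self._records.append(
            Record(offset, int(time.time() * 1000), key, value))
        return offset

    def read_from(self, offset: int) -> List[Record]:
        # Every consumer reading from the same offset sees the same order.
        return self._records[offset:]


partition = Partition()
partition.append(value=b"P")   # published first, so it precedes Q
partition.append(value=b"Q")
assert [r.value for r in partition.read_from(0)] == [b"P", b"Q"]
```

Since every record's position is fixed by its offset at append time, any two consumers that replay the log from the same offset necessarily observe P before Q, which is the total-order guarantee in miniature.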
The diagram below illustrates the linear arrangement of records in a single partition.