Combining strict order with massive parallelism using Kafka

Getting the best of both worlds

Emil Koutanov
Published in codeburst
7 min read · Jan 25, 2021

Having been involved in several large-scale Kafka projects for different clients across a broad range of industries, I have heard my fair share of questions on Apache Kafka — ranging from the fundamental to the esoteric. One question that never seems to go out of fashion is: How can you maintain strict order, yet still process records in parallel?

And it’s a fair question. Strict order assumes linearizability, the very notion of which seems to contradict the objectives of parallelism.

Partial and total order

We will start by exploring the notion of order.

As expected of an event-streaming platform, Kafka preserves the order of published records, provided those records occupy the same partition. To understand what this means in practice, one needs to explore the architecture of Kafka topics and their underlying sharding mechanism: partitions.

Note: Kafka uses the term ‘record’, where others might use ‘message’ or ‘event’.
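
To make the per-partition ordering guarantee concrete, here is a minimal, self-contained sketch that simulates how keyed records might be spread across partitions. It does not use the Kafka client. The class name PartitionOrderSketch, the methods partitionFor and publish, and the use of String.hashCode() are all assumptions made for illustration; Kafka's default partitioner hashes the serialized key bytes with murmur2 instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionOrderSketch {

    // Hypothetical stand-in for a key-based partitioner. Kafka's default
    // partitioner uses murmur2 over the serialized key, not String.hashCode().
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Appends each {key, value} record to the log of the partition chosen by
    // its key, in publication order. Each list models one partition's log.
    public static Map<Integer, List<String>> publish(List<String[]> records, int numPartitions) {
        Map<Integer, List<String>> partitions = new LinkedHashMap<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.put(p, new ArrayList<>());
        }
        for (String[] record : records) {
            int partition = partitionFor(record[0], numPartitions);
            partitions.get(partition).add(record[0] + ":" + record[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Records are published in this order; keys are made-up customer names.
        List<String[]> records = List.of(
            new String[] {"alice", "1"},
            new String[] {"bob", "1"},
            new String[] {"alice", "2"},
            new String[] {"carol", "1"},
            new String[] {"bob", "2"}
        );

        // Within any one partition, records appear in the order they were published.
        publish(records, 3).forEach((partition, log) ->
            System.out.println("partition " + partition + " -> " + log));
    }
}
```

Records that share a key always map to the same partition, so their relative order is preserved within that partition's log. Records with different keys may land in different partitions, and no order is defined between them. This is the sense in which order is partial rather than total.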

Software architect, an engineer, and a dad. Also an avid storyteller, maintainer of Kafdrop, and author of Effective Kafka.