Zero Copy Basics

0️⃣💾 the most concise explanation of the operating system's zero-copy concept in 2 minutes

Zero Copy

If you’ve ever read about Kafka, a particular optimization it makes use of might have caught your eye — the operating system’s zero-copy optimization.

A zero-copy operation is one that does not make unnecessary copies of the data.
(It doesn't mean that literally zero copies are made.)

In Kafka's case, it means the OS copies the data from the page cache directly into the socket buffer, bypassing the Kafka broker's Java process entirely.

This saves you a few extra copies and user <-> kernel mode switches.

Let us follow an example:

No Zero Copy

If your app's job is to read a file from the disk and send it over the network, the straightforward approach makes several unnecessary copies and user/kernel mode switches.

Some terminology:

  • read buffer - this is the OS page cache.

  • socket buffer - this is an OS byte buffer for managing packets.

  • NIC buffer - a byte buffer in the network card.

  • DMA copy - DMA stands for Direct Memory Access - a hardware feature that allows devices (graphics card, sound card, network card, etc.) to access the memory (RAM) without the CPU's involvement.

In this example, we have 4 mode switches and 4 data copies.

  1. app initiates the disk → OS buffer DMA copy (user → kernel mode)

  2. read buffer → app buffer copy (kernel → user mode)

     (steps 1 and 2 can run in a loop if you have to read more than the read buffer can hold)

  3. app → socket buffer copy (user → kernel mode)

  4. socket buffer → NIC buffer DMA copy (kernel → user mode after the response is written out)
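The four steps above can be sketched in a few lines. This is a minimal Python illustration, not Kafka's code: the function names, the temp file and the local socket pair are all assumptions made for the demo. The point is that every chunk surfaces in the app's memory before going straight back into the kernel.

```python
import os
import socket
import tempfile


def send_file_with_copies(path, sock, chunk_size=4096):
    """Traditional path: read() into a user-space buffer, then send() it back."""
    total = 0
    # buffering=0 avoids Python's own extra user-space buffer
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)  # steps 1-2: disk -> read buffer -> app buffer
            if not chunk:
                break
            sock.sendall(chunk)         # steps 3-4: app buffer -> socket buffer -> NIC
            total += len(chunk)
    return total


def demo_copy():
    """Hypothetical demo: push a small temp file through a local socket pair."""
    payload = b"kafka-record-" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    left, right = socket.socketpair()
    try:
        sent = send_file_with_copies(tmp.name, left)
        left.close()  # signal end-of-stream to the reader
        received = b""
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            received += chunk
    finally:
        left.close()
        right.close()
        os.unlink(tmp.name)
    return sent, received == payload
```

Calling demo_copy() returns the number of bytes sent and whether the receiver got the same bytes back.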

We can do better.

Zero Copy

Kafka stores the data in the same binary format it responds to requests with.

It makes no sense to perform the original steps 2 and 3, since Kafka doesn't do anything with the data - it would simply pass it straight back to the kernel.

With zero-copy, the data is NOT copied to Kafka - it directly goes to the NIC buffer.

Notice that there is another optimization here - the data is copied directly from the read buffer to the NIC buffer, skipping the socket buffer.

This is the so-called scatter-gather operation (a.k.a. vectored I/O).

scatter-gather - the act of only storing read buffer pointers in the socket buffer, and having the DMA engine read those addresses directly from memory.
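Applications see a related idea as vectored I/O: one system call is handed a list of separate buffers instead of one contiguous block. Below is a minimal Python sketch, assuming a Unix-like system where socket.sendmsg is available. The names gather_send and demo_gather are made up for the demo, and this is only a user-space analogue of the gather that the NIC's DMA engine performs, not the same mechanism.

```python
import socket


def gather_send(sock, header, body):
    """Hand two separate buffers to the kernel in a single system call.

    Nothing is concatenated in user space first. sendmsg may send fewer
    bytes than requested for large inputs; this sketch only sends a few.
    """
    return sock.sendmsg([header, body])


def demo_gather():
    """Hypothetical demo over a local socket pair."""
    left, right = socket.socketpair()
    try:
        header = b"len=5;"
        body = memoryview(b"hello")  # a view over existing bytes, no copy made here
        sent = gather_send(left, header, body)
        left.close()  # signal end-of-stream to the reader
        received = b""
        while True:
            chunk = right.recv(1024)
            if not chunk:
                break
            received += chunk
    finally:
        left.close()
        right.close()
    return sent, received
```

The receiver sees one contiguous byte stream, even though the sender never joined the two buffers itself.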

The end result?

  • 2 user/kernel mode switches. (2 fewer)

  • 2 DMA copies (the same)

  • 1 minuscule CPU copy of pointers. (2 fewer full CPU copies)
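The zero-copy path can be sketched the same way. Kafka's broker does this through Java's FileChannel.transferTo, which uses the sendfile system call on Linux. The Python below is only an analogue, with made-up function names. It relies on socket.sendfile, which uses os.sendfile where the platform supports it and quietly falls back to plain send() where it doesn't, so treat the "zero-copy" part as an assumption about your OS.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path, sock):
    """Ask the kernel to move the file to the socket; the app never holds the bytes."""
    with open(path, "rb") as f:
        # One call, no read() into an app buffer and no send() from one
        return sock.sendfile(f)


def demo_zero_copy():
    """Hypothetical demo: push a small temp file through a local socket pair."""
    payload = b"kafka-record-" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    left, right = socket.socketpair()
    try:
        sent = send_file_zero_copy(tmp.name, left)
        left.close()  # signal end-of-stream to the reader
        received = b""
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            received += chunk
    finally:
        left.close()
        right.close()
        os.unlink(tmp.name)
    return sent, received == payload
```

Compared with a read/send loop, the app code shrinks to a single call and never touches the payload, which is exactly why this only works when the app has nothing to do with the bytes.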

In Kafka

And now for the hard truth - zero-copy isn’t that impactful in most Kafka deployments.

CPU is rarely the bottleneck. The network gets saturated much faster, so the lack of in-memory copies doesn’t move the needle in most cases.

Plus, enabling SSL/TLS encryption already prevents Kafka from using zero-copy, because the broker has to encrypt the data in user space before sending it - and Kafka still performs!


Apache®, Apache Kafka®, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.