Zero Copy Basics
0️⃣💾 the most concise explanation of the operating system's zero-copy concept in 2 minutes
![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/232c1770-1dfd-4ab5-9ef4-4f49cd4ac526/zerocopyos.png?t=1691419897)
Zero Copy
If you’ve ever read about Kafka, a particular optimization it makes use of might have caught your eye — the operating system’s zero-copy optimization.
A zero-copy operation is one which does not make unnecessary copies of the data.
(it doesn’t actually mean you make literally zero copies)
In Kafka’s case, it means the OS copies the data from the page cache directly into the socket buffer, bypassing the Kafka broker Java program entirely.
This saves you a few extra copies and user <-> kernel mode switches.
Let us follow an example:
No Zero Copy
If your app’s job is to read a file from the disk and send it over the network, a bunch of unnecessary copies and user/kernel mode switches can be made.
![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/80806dc0-d7fb-42d9-8634-a4fc618585cf/nozerocopymail.png)
Some terminology:
read buffer - this is the OS page cache.
socket buffer - this is an OS byte buffer for managing packets.
NIC buffer - a byte buffer in the network card.
DMA copy - DMA stands for Direct Memory Access - a hardware feature which allows devices (graphics card, sound card, network card, etc.) to read and write the memory (RAM) without the CPU having to copy the data itself.
In this example, we have 4 mode switches and 4 data copies.
1. app initiates the disk → read buffer DMA copy (user → kernel mode)
2. read buffer → app buffer copy (kernel → user mode)
(steps 1 and 2 can be run in a loop if you have to read more than what the read buffer can hold)
3. app buffer → socket buffer copy (user → kernel mode)
4. socket buffer → NIC buffer DMA copy (kernel → user mode after the response is written out)
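The steps above map onto a plain read/send loop. Here is a minimal Python sketch of it, assuming a blocking, already-connected socket. The names `send_file_with_copies` and `CHUNK_SIZE` are hypothetical, picked for illustration, and have nothing to do with Kafka's own code.

```python
import socket

CHUNK_SIZE = 64 * 1024  # hypothetical size of the user-space app buffer


def send_file_with_copies(path: str, sock: socket.socket) -> int:
    """Send a file the traditional way; returns the number of bytes sent."""
    total = 0
    # buffering=0 so that every read() below is a real read system call
    with open(path, "rb", buffering=0) as f:
        while True:
            # read(): user -> kernel switch, then a CPU copy from the
            # read buffer (page cache) into our app buffer
            chunk = f.read(CHUNK_SIZE)
            if not chunk:  # empty bytes means end of file
                break
            # send(): user -> kernel switch, then a CPU copy from our
            # app buffer into the socket buffer
            sock.sendall(chunk)
            total += len(chunk)
    return total
```

Every byte of the file passes through `chunk` in user space, even though the app never looks at it.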
We can do better.
Zero Copy
![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bf44ccc2-85f8-4055-8458-fbe6055da3ec/zerocopyemail.png)
Kafka stores the data in the same binary format it responds to requests with.
It made no sense to do the original steps 2 and 3, as Kafka didn’t do anything with the given data - it would simply pass it back to the kernel.
With zero-copy, the data is NOT copied to Kafka - it directly goes to the NIC buffer.
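To make this concrete, here is a minimal Python sketch of the same file-to-socket transfer done through the kernel instead of through the app. It assumes a blocking, connected stream socket, and `send_file_zero_copy` is a hypothetical name. Kafka itself is a Java program and reaches the same OS facility through Java's `FileChannel.transferTo`, so treat this as an illustration of the idea rather than Kafka's code.

```python
import socket


def send_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Send a whole file over a stream socket; returns the bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() (the sendfile system call on
        # Linux) where it is available, so the data moves from the page
        # cache to the socket inside the kernel and never enters this
        # process. Where it is not available, Python falls back to a plain
        # send() loop, which is NOT zero-copy.
        return sock.sendfile(f)
```

There is no app buffer in this version at all: the program only tells the kernel which file to send and where.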
Notice that there is another optimization here - the data is copied from the read buffer directly to the NIC buffer, not to the socket buffer.
This is the so-called scatter-gather operation (a.k.a. vectored I/O).
scatter-gather - the act of only storing read buffer pointers in the socket buffer, and having the DMA engine read those addresses directly from memory.
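The DMA-level gather described here happens inside the kernel and the network card, so an app cannot call it directly. The closest thing you can try from user space is a gather write, where one system call receives a list of separate buffers instead of one joined buffer. Below is a small Python sketch of that analogue, using the standard `socket.sendmsg`; the function name `gather_send` is hypothetical.

```python
import socket
from typing import Sequence


def gather_send(sock: socket.socket, parts: Sequence[bytes]) -> int:
    """Pass several separate buffers to the kernel in one sendmsg() call.

    Returns the number of bytes sent, which on a stream socket may be
    less than the combined length of the parts.
    """
    # There is no b"".join(parts) here: the kernel gets a list of
    # (address, length) pairs and gathers the bytes itself.
    return sock.sendmsg(parts)
```

The principle is the same as in the zero-copy path: pass around references to where the data lives and let the reader collect it, rather than assembling one contiguous copy first.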
The end result?
2 user/kernel mode switches. (2 fewer)
2 DMA copies. (the same)
1 minuscule CPU copy of pointers, instead of 2 full CPU copies of the data.
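The tally can be written down as a small Python sketch. The counts come straight from the two walkthroughs (4 copies = 2 DMA + 2 CPU without zero-copy); the `TransferCost` class and the `savings` helper are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferCost:
    mode_switches: int
    dma_copies: int
    cpu_data_copies: int     # full copies of the payload done by the CPU
    cpu_pointer_copies: int  # tiny copies of buffer pointers only


NO_ZERO_COPY = TransferCost(mode_switches=4, dma_copies=2,
                            cpu_data_copies=2, cpu_pointer_copies=0)
ZERO_COPY = TransferCost(mode_switches=2, dma_copies=2,
                         cpu_data_copies=0, cpu_pointer_copies=1)


def savings(before: TransferCost, after: TransferCost) -> dict:
    """How many of each expensive operation the second path avoids."""
    return {
        "mode_switches": before.mode_switches - after.mode_switches,
        "dma_copies": before.dma_copies - after.dma_copies,
        "cpu_data_copies": before.cpu_data_copies - after.cpu_data_copies,
    }


print(savings(NO_ZERO_COPY, ZERO_COPY))
```

The pointer copy is tracked separately because its cost does not grow with the size of the payload.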
![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3745432f-d779-48fd-8352-3ff047d0b4e7/zero_copy_vs_no_zero_copy.png)
In Kafka
And now for the hard truth - zero-copy isn’t that impactful in most Kafka deployments.
CPU is rarely the bottleneck. The network gets saturated much faster, so the lack of in-memory copies doesn’t move the needle in most cases.
Plus, enabling SSL/TLS encryption already prevents Kafka from using zero-copy, since the broker has to pull the data into user space to encrypt it - and Kafka still performs!
🗣This Week’s Socials
I’ve started posting a bit less frequently on the socials recently, and I do expect it to stay intermittent throughout next month (traveling) - but the ones I did post, I reckon, were very good:
I used to misunderstand Kafka’s concept of the high watermark.
That’s normal - the concept is entangled with advanced distributed systems terms - replication, fault tolerance & durability.
But once I visualized it, it became clear.
Let me explain it first, and then you can… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
3:29 PM • Jul 22, 2023
One thing determines your long-term success in managing your own Apache Kafka cluster…
Replica distribution.
It’s probably THE determining factor in how well your cluster scales and is used throughout its lifespan.
It is a huge topic in Apache Kafka. (no pun intended)
And… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
7:08 AM • Aug 1, 2023
This is what an optimal real-time analytics data infrastructure looks like:👇
Uber has paved the way in showing how to both:
• build infrastructure to support massive amounts of data.
• leverage the data in diverse, often conflicting, use cases.
Each day, they process… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
3:06 PM • Jul 19, 2023
Ever wish someone had made a performance checklist for Kafka?
Here it is. 🔥
PS: What did I miss? twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
3:46 PM • Jul 21, 2023
🚨 3AM: wake up by page.
😳 4AM: Realize customer's deployment is deleted.
😥 5AM: Scramble & page teams to find remediation.
😵💫 11PM: Contact affected customers.
💀 14 Days Later: Incident fully mitigated.
This should scare you.
Which lesson would have prevented this?
— Stanislav Kozlovski (@BdKozlovski)
5:28 AM • Jul 28, 2023
Cloudflare serves around 20% of the web with 46 million requests a second.
Surely they must have a lot of data.
Where do they store it?
Plain old PostgreSQL. 🐘
Around 15-20 clusters of them.
Each cluster consists of 3 servers split into two regions.
The primary region is… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
7:08 AM • Jul 26, 2023
A company like Cloudflare knows a thing or two about protecting systems against client stampedes.
If Cloudflare protects customers with their DDoS Protection, Firewall & DNS Resolver products...
...and PostgreSQL serves the transactional workloads for those services… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
2:23 PM • Aug 3, 2023
The best startups don’t use the latest fancy language, framework, or cloud service.
The best startups use:
boring technology.
Today’s case in point:
👉 Loom & Postgres 🐘
Loom is a nifty little app that allows you to quickly send out screen + video recordings - a sort of… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
2:10 PM • Aug 2, 2023
😰 Writing your stream processing with low-level APIs
😠 Writing your stream processing with high-level abstractions
What's right?
The right abstraction, of course! 😇
But most companies and projects lack them.
Which is normal - as one size does not fit all.
The only way to… twitter.com/i/web/status/1…
— Stanislav Kozlovski (@BdKozlovski)
6:25 PM • Jul 20, 2023
Apache®, Apache Kafka®, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.