Streaming Audio: Apache Kafka® & Real-Time Data
Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems to data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.
Apache Kafka 3.0 - Improving KRaft and an Overview of New Features
Apache Kafka® 3.0 is out! To spotlight major enhancements in this release, Tim Berglund (Apache Kafka Developer Advocate) provides a summary of what’s new in the Kafka 3.0 release from Krakow, Poland, including API changes and improvements to the early-access Kafka Raft (KRaft).
KRaft is a built-in Kafka consensus mechanism that's replacing Apache ZooKeeper going forward. It is recommended to try out new KRaft features in a development environment, as KRaft is not yet advised for production. One of the major features in Kafka 3.0 is the ability for KRaft controllers and brokers to efficiently store, load, and replicate snapshots of the metadata topic partition. The Kafka controller is now responsible for generating producer IDs in both ZooKeeper and KRaft modes, easing the transition from ZooKeeper to KRaft on the Kafka 3.x version line. This update also moves us closer to the ZooKeeper-to-KRaft bridge release. Additionally, this release includes metadata improvements, exactly-once semantics, and KRaft reassignments.
To enable a stronger record delivery guarantee, Kafka producers now turn on idempotence by default, together with acknowledgment of delivery by all replicas. This release also includes enhancements to Kafka Connect task restarts, Kafka Streams timestamp-based synchronization, and more flexible configuration options for MirrorMaker 2 (MM2). The first version of MirrorMaker has been deprecated, and MirrorMaker 2 will be the focus for future development. Beyond that, this release drops support for the older message formats V0 and V1 and deprecates Java 8 and Scala 2.12 across all components of Apache Kafka. Removal of Java 8 and Scala 2.12 support is expected to be completed in the upcoming Apache Kafka 4.0 release.
Apache Kafka 3.0 is a major release and step forward for the Apache Kafka project!
Tim Berglund:
Apache Kafka 3.0 is available and is packed as usual with interesting KIPs. It continues the progress toward the lofty goals encompassed by KIP-500, that's Kafka without ZooKeeper, and adds a bunch of other interesting functionality as well, making Kafka easier to use and, I think, even more cloud friendly. Streaming Audio is brought to you by Confluent Developer, that's developer.confluent.io, a website that's got everything you need to get started learning Apache Kafka. There are executable tutorials, there's a library of event-driven design patterns, there are video courses, everything you are going to need. So check it out. But first, let's check out Kafka 3.0.
Tim Berglund:
Now, if you remember 2.8 and the video for that where I was out in the woods pretending to be David Attenborough, you might be thinking, Tim, we already covered KIP-679. That was in 2.8. And you would be 100% correct. What 679 did was enable the strongest producer guarantees by default. That is, enable.idempotence is set to true and acks is set to all by default. Well, historically ACLs had an IDEMPOTENT_WRITE operation and a WRITE operation. As of 3.0, IDEMPOTENT_WRITE is being deprecated. It will be removed entirely in a future version, that version not yet being specified. But this is a part of the whole KIP-679 thing. There really isn't a reason to have that IDEMPOTENT_WRITE operation anymore.
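For reference, here's a minimal sketch of those defaults in Java, assuming a broker at localhost:9092 and a hypothetical topic called my-topic. The two properties are set explicitly only to make the new out-of-the-box behavior visible; as of 3.0 you could omit them entirely:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerDefaults {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // These are the defaults as of 3.0 (KIP-679), so both lines are redundant;
        // they're here only to make the out-of-the-box guarantees explicit.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value")); // hypothetical topic
        }
    }
}
```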
Tim Berglund:
So you want to update your authorizers to use WRITE and not IDEMPOTENT_WRITE because, well, that'll be a compilation error at some point in the future. KIP-630 is the Kafka Raft Snapshot. All right. This is a part of the work that goes under the umbrella of KIP-500. Of course, that's in preview mode as of 2.8, still in preview mode for 3.0. It's not yet considered to be production ready. But this is just an improvement in that whole work stream.
Tim Berglund:
Let me break it down for you a little bit. So you've got the __cluster_metadata topic. That's underscore underscore cluster underscore metadata. That's an internal topic. That's the metadata log. And this is where metadata changes are kept and, well, I guess, negotiated cluster-wide to arrive at consensus about the state of the cluster, rather than using ZooKeeper for that. So that's the heart and soul of the KRaft protocol. Snapshots are desirable for a thing like that, right? Because in the quorum leader, that's the active controller, that's the quorum leader in KRaft mode, that log has to get realized into a table in memory, right? And that's not a terribly uncommon thing to do in stream processing in general, to have a topic that you make into a table. Kafka Streams does that, ksqlDB does that. That's all over the place.
Tim Berglund:
Well, what KIP-630 does is it introduces a new cleanup policy. Historically we've got compact and delete as our cleanup policies. Now we've got one called snapshot. And right now it just works on this internal topic. I mean, I kind of think maybe this is going to grow legs and be useful elsewhere in Kafka in the future. Right now, it's just for creating snapshots of the metadata log for use by the quorum leader. The snapshot actually gets stored, as you can see on the slide here, as a file on disk. Super cool thing. It's a nice optimization for dealing with the metadata log. And again, watch this space, right? This is probably going to expand beyond just this topic in the future. That's not a guarantee. But it seems kind of likely. This is pretty cool stuff that you've probably wanted in the past. And it looks like you might have it now.
Tim Berglund:
KIP-724 drops support for message formats V0 and V1. Now V2, which is the current version, dates back to 2017, in release 0.11. But definitely a previous release of Kafka, four years in the past with respect to this release. Message formats apply to two places. One is the messages as they're actually serialized and written into partition logs on disk. And the other is in the client interface, right? The wire protocol that producers and consumers use to talk to brokers. So this refers to that end of things, clients talking to brokers. So if you've got older clients, clients compiled with versions of the library prior to 0.11, they'll still be using those old message formats. So those are now deprecated as of Kafka 3.0. They will be removed, or as I like to say, deprecated with extreme prejudice, as of 4.0. So now is a good time to start getting those things updated. I mean, it's only been four years. We don't want to rush, I know. But time to start looking at that.
Tim Berglund:
KIP-709 extends the OffsetFetch API to allow for multiple group IDs. Now remember, the OffsetFetch API lets me get the current offset of a particular consumer group ID for each partition that it's responsible for. Now, why would you want to do that for multiple groups? That seems like a thing a consumer would do. A consumer is waking up and it's saying, hey, yo, what offset was I reading again last time I was live? Which would seem to be really scoped to just one consumer group. Well, if you're a monitoring application, say there are six consumer groups you want to show, and you want to show each one of those groups' position in a particular partition, you'd have to do that with six requests in the old API. Now you can just say, "Oh hey. Yeah, by the way, here are my six group IDs. Can I please have offsets for all of those?" So that's one example of a use case where this enhancement is a happy thing.
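Keep in mind KIP-709 is a change to the wire protocol, so the batching happens under the hood. Here's a sketch of the monitoring pattern it serves, using the long-standing single-group Admin API method and hypothetical group names; the protocol change means brokers can serve lookups like these in a single OffsetFetch request for clients that batch them:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class GroupOffsetsMonitor {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (Admin admin = Admin.create(props)) {
            // Hypothetical consumer groups a dashboard might be watching.
            for (String groupId : List.of("billing-app", "audit-app", "search-indexer")) {
                Map<TopicPartition, OffsetAndMetadata> offsets =
                        admin.listConsumerGroupOffsets(groupId)
                             .partitionsToOffsetMap()
                             .get();
                offsets.forEach((tp, om) ->
                        System.out.printf("%s %s -> offset %d%n", groupId, tp, om.offset()));
            }
        }
    }
}
```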
Tim Berglund:
Some language deprecations. So KIP-750 deprecates Java 8, and KIP-751 deprecates Scala 2.12. Those are deprecated as of 3.0; support will be removed in 4.0. So this is an announcement that Java 8 and Scala 2.12 will no longer work in Kafka 4.0. They are currently deprecated. So it's time to move on. Again, I know. Old friends, comfortable places. It's like you feel like you know that version. That's when you really did all the great things you did. We need to move on. So those are definitely deprecated as of this release. It's pretty typical. Most teams are probably using Java 11 by now. This shouldn't impact too many people. But it's a thing to be aware of if you're still on Java 8. It's time to start moving on.
Tim Berglund:
KIP-707 is the future of KafkaFuture. And this actually kind of relates to Java 8. So the KafkaFuture object had to work with versions of Java prior to 8, back when Kafka supported versions of Java prior to 8. And there was no CompletableFuture class or CompletionStage interface in those days. And so they weren't a part of that API. Well, they are now. Now I've got a toCompletionStage method on KafkaFuture, and from the resulting CompletionStage I can get a CompletableFuture, so I can get those standard Java API interfaces back out of KafkaFuture and into the broader standard Java future world.
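Here's a minimal sketch of that bridge, assuming a broker at localhost:9092; describeCluster is just a convenient Admin call that happens to return a KafkaFuture:

```java
import java.util.Collection;
import java.util.Properties;
import java.util.concurrent.CompletionStage;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

public class KafkaFutureBridge {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (Admin admin = Admin.create(props)) {
            // nodes() returns a KafkaFuture<Collection<Node>>; toCompletionStage()
            // (new in 3.0, KIP-707) bridges it into standard Java async APIs.
            CompletionStage<Collection<Node>> stage =
                    admin.describeCluster().nodes().toCompletionStage();

            stage.thenAccept(nodes ->
                     System.out.println("Cluster has " + nodes.size() + " broker(s)"))
                 .toCompletableFuture()
                 .join(); // block only so this demo doesn't exit before the callback runs
        }
    }
}
```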
Tim Berglund:
KIP-734 improves the AdminClient.listOffsets to give us an option for returning the offset of the message with the latest timestamp. Now, let me break this down a little bit. Historically we could ask for the earliest offset or the latest offset. Start at the beginning, start at the end. Or the first message with a timestamp greater than a given timestamp, right? So you could seek to a time, you could start at the end, you could start at the beginning. This now explicitly asks for the message with the highest timestamp on the partition.
Tim Berglund:
The idea here is that this is now an efficient way of figuring out whether a topic is alive, whether stuff is happening. And if you have a fairly low traffic topic, it kind of was a pain in the neck to do that with the previous versions. And this now just gives you a cleaner way to say, has anybody been here recently? Tap, tap, is this thing on, kind of thing. So that is what this KIP is about.
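In the Admin API, this shows up as a new OffsetSpec. A minimal sketch, assuming a broker at localhost:9092 and a hypothetical topic my-topic:

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.common.TopicPartition;

public class LatestTimestampCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (Admin admin = Admin.create(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0); // hypothetical topic

            // OffsetSpec.maxTimestamp() (new in 3.0, KIP-734) returns the record with
            // the highest timestamp, which is not necessarily the last offset if
            // records were produced with out-of-order timestamps.
            ListOffsetsResult.ListOffsetsResultInfo info =
                    admin.listOffsets(Map.of(tp, OffsetSpec.maxTimestamp()))
                         .partitionResult(tp)
                         .get();

            System.out.printf("offset=%d timestamp=%d%n", info.offset(), info.timestamp());
        }
    }
}
```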
Tim Berglund:
KIP-720 is another deprecation KIP. All kinds of cleanup happening in 3.0. So MirrorMaker 1 is being marked as deprecated as of Kafka 3.0. Again, will be deprecated with extreme prejudice or removed in AK 4.0. So that's just a deprecation warning on MirrorMaker 1. With the advent of MirrorMaker 2 now more than two years ago, this should be a pretty safe thing. I shouldn't say more than two years ago. It's coming up on two years ago. But hey, close enough.
Tim Berglund:
Now KIP-716 allows us to configure the location of the offset syncs topic with MirrorMaker 2. So that's an internal topic that's used to sync offsets between the source and the target. And previously that had to be located with the source cluster. And now I get to specify whether it's source or target so it can live on either end, which cleans up some deployments in some cases.
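As a sketch, the relevant MirrorMaker 2 configuration might look like this, assuming a simple A-to-B replication flow; the cluster aliases and bootstrap servers are placeholders, and offset-syncs.topic.location is the property introduced by KIP-716:

```properties
# connect-mirror-maker.properties (excerpt); aliases and hosts are placeholders
clusters = A, B
A.bootstrap.servers = a-broker:9092
B.bootstrap.servers = b-broker:9092
A->B.enabled = true

# New in 3.0 (KIP-716): put the internal offset-syncs topic on the target
# cluster instead of the source, which was previously the only option.
A->B.offset-syncs.topic.location = target
```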
Tim Berglund:
On to the Kafka Connect KIPs. KIP-745: the Connect API can now restart a connector and its tasks. Let's look at this. Previously, when a connector or one of a connector's tasks failed, due to an exception in just that task maybe, you could call the restart API. However, just calling restart only restarted the connector instance, not the tasks themselves. Now let me explain. Imagine something goes wrong with a connector. I know. It's data integration, nothing ever goes wrong. But in the rare event that something gets a little bit messed up, there's an exception somewhere, you had a call to restart the connector.
Tim Berglund:
You'd expect that to restart the connector and its tasks. However, the failure may have happened in just one task. There could have been an exception for some reason or whatever. And it wouldn't restart the tasks at all. Now it does. Or at least it can. With KIP-745, I have two optional parameters: includeTasks, which means please restart all the tasks as well, and onlyFailedTasks, meaning please restart only the failed tasks. So that's just a slightly cleaner way to give you a little bit more granularity into bouncing a connector where something has gone wrong.
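As a sketch, here's what that looks like against the Connect REST API using Java's built-in HTTP client; the worker URL and connector name are hypothetical, and the two query parameters are the ones KIP-745 adds:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestartFailedTasks {
    public static void main(String[] args) throws Exception {
        // Hypothetical Connect worker URL and connector name; the two query
        // parameters are the ones added by KIP-745.
        String url = "http://localhost:8083/connectors/my-sink-connector/restart"
                + "?includeTasks=true&onlyFailedTasks=true";

        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url))
                        .POST(HttpRequest.BodyPublishers.noBody())
                        .build(),
                HttpResponse.BodyHandlers.ofString());

        // A 2xx response means the restart request was accepted; the body
        // reports the connector and task states.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```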
Tim Berglund:
KIP-721 enables connector log contexts in Log4j configuration. Now you may remember from, I believe it was a previous release, KIP-449 gave us the ability to include connector contexts in a worker's log output. It was a huge improvement. Just a lot easier to trace through the logs. Well, it was so great we decided we wanted it on by default. And that's really all this KIP does. So the work of KIP-449 is now enabled by default. You don't have to enable it explicitly.
Tim Berglund:
Now on to Kafka Streams and KIP-695, further improve timestamp synchronization. Let's look at this one. Now, when Streams is reading from multiple partitions, right, which is a completely realistic scenario, there could be multiple partitions that an individual stream processing instance is responsible for, it has to choose which partition to read from next when it's getting a new message, right? That makes enough sense. Well, it always takes from the partition for which the next record has the least timestamp. It's trying to find the oldest message, because in a sense, there's the most urgency to get that processed, right? So it's looking for that lowest-timestamp message from the available partitions. But you've got buffers from these several partitions that you may be reading, and one of those buffers may be empty. If that's the case, you're like, well, what do I do? Do I sit around and wait? I mean, I can't tell what the lowest timestamp is if I can't tell what's coming in on partition nine here or whatever. So there's a bit of a quandary.
Tim Berglund:
Previously, max.task.idle.ms would tell Streams how long to dwell before moving on. There's an empty partition, or empty partition buffer. How long should I wait before I just say, look, there's nothing there? I'll take my next best thing. But this was difficult to use in practice because there were other polling parameters that interacted with it. And so there's some slightly cleaner behavior now. Now, setting max.task.idle.ms to zero means always wait to fetch data that's present on the broker. Don't care how long it takes. Just going to wait for the buffer to fill up. Setting it to negative one means never wait to buffer extra data. That was the prior default behavior. And setting it to something greater than zero means applying that much extra time to wait after catching up to the broker. So if there's a slow producer in the pipeline, something like that, that's maybe the setting that you'd use there. So hopefully cleaner and easier-to-optimize behavior for your Streams applications here.
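In code, that's a single Streams config. A minimal sketch, with a hypothetical application ID and an assumed local broker:

```java
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class StreamsIdleConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        // Semantics as of 3.0 (KIP-695):
        //   0L  -> new default: wait for data that already exists on the broker
        //  -1L  -> never wait on an empty buffer (the old default behavior)
        //  >0L  -> wait that many extra ms after catching up to the broker
        props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 0L);

        // These props would then be passed to new KafkaStreams(topology, props).
        System.out.println(props);
    }
}
```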
Tim Berglund:
And now we have KIP-633, which removes the default 24-hour grace period for windows. Now, when you've got windows defined, we always have to deal with the possibility of late-arriving data. So a window closes. But if there are timestamps in the messages, that message time or event time may be out of order. And so you may have an event arriving after a window has closed whose timestamp suggests it should have been in the previous window. If so, you have to recompute results for that previous window.
Tim Berglund:
Well, that raises the question of how long do you wait? You have to keep those windows around in the event that maybe an event arrives that belonged there, and sort of keep that stuff ready to reprocess. And you have to set a default or some kind of maximum on when you're just going to let those previous windows expire. KIP-633 gives us flexibility there. In some cases, the 24-hour default is just too long and can lead to difficult-to-understand behavior. And this grace period, I mean, I was just explaining how it works in windows. It really is a fundamental concept. It's not some accidental thing. And so slightly more granular control over it seems like a good idea.
Tim Berglund:
So 633 removes the APIs that use the default grace period, replacing them with two new methods, ofSizeAndGrace and ofSizeWithNoGrace. Now that second one, just a chilling name for a method. Should be easy to remember. So the new methods in the TimeWindows class are ofSizeAndGrace and ofSizeWithNoGrace. The grace behavior is tacked right onto those method names, so you're explicit about grace-period behavior now. It's not a thing that you have to guess about. It really is a thing you get to make a decision about. Now, that's all I've got. As usual, this is just a selection of the KIPs in the release. It's not everything. You want to check out the blog post, you want to read the release notes. That's always the definitive list of everything. And let us know what you build.
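Here's a short sketch of the two new TimeWindows methods, using a hypothetical 10-minute window:

```java
import java.time.Duration;

import org.apache.kafka.streams.kstream.TimeWindows;

public class GraceExamples {
    public static void main(String[] args) {
        // Explicit grace: records up to 5 minutes late can still update the window.
        TimeWindows withGrace =
                TimeWindows.ofSizeAndGrace(Duration.ofMinutes(10), Duration.ofMinutes(5));

        // No grace: a window's result is final the moment the window closes.
        TimeWindows noGrace =
                TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(10));

        System.out.println(withGrace.gracePeriodMs() + " " + noGrace.gracePeriodMs());
    }
}
```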
Tim Berglund:
And there you have it. I hope this podcast was helpful to you. If you want to discuss it or ask a question, you can always reach out to me at @tlberglund on Twitter. That's @ T L B E R G L U N D. Or you can leave a comment on a YouTube video or reach out in community Slack. There's a Slack sign-up link in the show notes. If you want to register there. And while you're at it, please subscribe to our YouTube channel. And to this podcast, wherever fine podcasts are sold. And if you subscribe through iTunes, be sure to leave us a review there. That helps other people discover the podcast, which we think is a good thing. So thanks for your support. And we'll see you next time.