Streaming Audio: Apache Kafka® & Real-Time Data

Apache Kafka 2.8 - ZooKeeper Removal Update (KIP-500) and Overview of Latest Features

Confluent, original creators of Apache Kafka®
Season 1, Episode 154

Apache Kafka 2.8 is out! This release includes early access to the long-anticipated ZooKeeper removal encapsulated in KIP-500, as well as other key updates, including the addition of a Describe Cluster API, support for mutual TLS authentication on SASL_SSL listeners, exposed task configurations in the Kafka Connect REST API, the removal of a properties argument for the TopologyTestDriver, the introduction of a Kafka Streams specific uncaught exception handler, improved handling of window size in Streams, and more.

Tim Berglund:
Apache Kafka 2.8 is finally here. And maybe you weren't waiting for that particular version number, but there's probably another number you've had your eye on: 500, specifically KIP-500. KIP-500 has been merged as of 2.8. And while it's not yet ready for production use, and not everything we have in mind for post-ZooKeeper Kafka is in place yet, there are some really cool features here that are available for use in development. And of course, there are other cool features in the release besides that.

Tim Berglund:
As usual, I want to go through all the KIPs that made it into 2.8. Those are the major chunks of new functionality, the Kafka Improvement Proposals that are in 2.8. And we're going to divide them into three categories. That's going to be Core Kafka, Kafka Connect, and Kafka Streams. So let's dive in.

Tim Berglund:
KIP-676: respect the logging hierarchy. Now, if you know Log4j in Java, loggers are hierarchical. Logger names are usually modeled after package names, like org.apache.kafka.clients.producer or org.apache.kafka.clients.consumer, something like that. And there were three ways to change logging levels in Kafka, but two of those ways didn't respect the Log4j logging hierarchy. Namely, if you used the DescribeConfigs RPC or the Log4jController MBean, good old MBean, you didn't get the hierarchical rippling of the log-level changes, but you did get it if you changed the level with the REST API. Well, now, thanks to 676, you get it with all three.
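To make the hierarchy concrete, here's a minimal sketch in Java against the Log4j 1.x API; the logger names are just illustrative. A level set on a parent logger is inherited by any child logger that doesn't set its own, and that inheritance is what KIP-676 now preserves across all three mechanisms.

```java
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class LoggingHierarchySketch {
    public static void main(String[] args) {
        // Set DEBUG on a parent logger in the hierarchy...
        Logger parent = Logger.getLogger("org.apache.kafka.clients");
        parent.setLevel(Level.DEBUG);

        // ...and a child logger with no explicit level of its own inherits it.
        Logger child = Logger.getLogger("org.apache.kafka.clients.producer.KafkaProducer");
        System.out.println(child.getEffectiveLevel()); // prints DEBUG
    }
}
```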

Tim Berglund:
KIP-673: emit JSON with a new auto-generated schema. Now, there's request tracing that you can turn on: if you set the RequestChannel logger to DEBUG, you get JSON-like output from that request trace. And by JSON-like, I mean there were curly braces and we were supposed to respect that, but it wasn't parsable as JSON. Well, now with this, it actually is. So you can extract that and parse it with any tool that parses JSON. A huge win for interpreting logs in a smart way.

KIP-679: the producer will enable the strongest delivery guarantee by default. So it's been a little while now since we've had EOS, exactly-once semantics. I think that was KIP... I want to say 98. It was in the 90s, and it was some years ago, almost four years ago at the time of this recording. So it's been with us, but it's not enabled by default.

Tim Berglund:
And you have two switches: that's enable.idempotence set to true, for the idempotent producer, and acks set to all. By default, prior to 679, enable.idempotence was off and acks was defaulting to 1, and so you didn't have the strongest producer guarantee. Well, now enable.idempotence is on by default and acks is set to all by default. Of course, you can do whatever you want with these settings, but the defaults are just a little more robust for the producer now.
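As a sketch, here's what those two settings look like on a producer in Java. As of KIP-679 these values are the defaults, so setting them explicitly is redundant; they're written out here for clarity, and the broker address and topic name are hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // The KIP-679 defaults, spelled out explicitly:
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("example-topic", "key", "value")); // hypothetical topic
        }
    }
}
```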

Tim Berglund:
KIP-684 is support for mutual TLS authentication on SASL_SSL listeners. The way this used to work is a little tricky: if you were using TLS client authentication with SASL_SSL listeners and had ssl.client.auth set to required, then all of the listeners connected to a broker would have to support TLS client authentication. And the anti-pattern that grew up around this was that you basically had to distinguish client identity by deploying different certificates to each client and setting the distinguished name to a different field, which means more certificate management, which means literally zero people are happy. Jeremy Bentham, call your office. Utility is not being maximized here. And actually, Jeremy Bentham, you can't call your office. You don't need to. So now, as you see in the middle of the slide, you can say listener.name, then a named listener, then ssl.client.auth, enable it on just that listener, and use SASL to identify clients. See, the connection is encrypted with TLS and clients are identifying themselves with SASL: best of both worlds, actually using all the technologies that we've got on the table. So KIP-684 also makes certificate management a little easier.
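Here's a rough, hypothetical sketch of what that per-listener override looks like in a broker's server.properties; the listener names and ports are made up for illustration.

```properties
# Two listeners: an internal one, and a SASL_SSL one named CLIENTS.
listeners=INTERNAL://0.0.0.0:9092,CLIENTS://0.0.0.0:9093
listener.security.protocol.map=INTERNAL:PLAINTEXT,CLIENTS:SASL_SSL
inter.broker.listener.name=INTERNAL

# KIP-684: require TLS client certificates on just this SASL_SSL listener,
# rather than a broker-wide ssl.client.auth applying to every listener.
listener.name.clients.ssl.client.auth=required
```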

Tim Berglund:
KIP-700 adds a DescribeCluster API. This is nice. This used to be buried in the Metadata API, but now it has an API of its own. It gives you basic cluster metadata, and it's likely to expand in the future with what it offers, but it's always nice to get a new, crisper admin API. This is another one of those things that you might say smells like rain. It feels like Kafka getting a little more cloud friendly, because it seems like, if you're going to be running multiple clusters in the cloud, you might want to know stuff about a cluster and know it well.
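From the client side this surfaces through the admin API's describeCluster call, which now has a dedicated RPC underneath it rather than piggybacking on Metadata. A minimal sketch, with a hypothetical broker address:

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;

public class DescribeClusterSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical

        try (Admin admin = Admin.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            // Basic cluster metadata: the cluster id, the current controller,
            // and the broker nodes.
            System.out.println("Cluster ID: " + cluster.clusterId().get());
            System.out.println("Controller: " + cluster.controller().get());
            System.out.println("Brokers:    " + cluster.nodes().get());
        }
    }
}
```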

Tim Berglund:
Lots of uses for this quite apart from cloud deployments, but it's one of those things that I just kind of like to see happening. And of course, the first release of KIP-500. Now, I need to caution you before you break out the champagne or whatever it is you celebrate with: this is not production ready yet. Kafka's new built-in consensus protocol, for example, is not doing ACLs yet. If you want to do ACLs, you still have to have ZooKeeper around. But you should start working with this in your test environments and your development environments. And you see on the slide, there are some impressive performance improvements for a controlled shutdown with two million partitions. Two million: that's twice as good as Dr. Evil ever wanted. We are going, it looks like, from two minutes fifteen seconds in our measurements with ZooKeeper to 32 seconds with the quorum controller. And on uncontrolled shutdown, the failure of a broker, we're going from eight and a half minutes to 37 seconds.
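If you want to kick the tires in a development environment, the early access runs the broker and quorum controller without ZooKeeper. Below is a rough sketch of the kind of KRaft-mode server.properties involved; all values here are illustrative, and the 2.8 early-access README has the actual steps (including formatting the storage directory before first start).

```properties
# Single-node KRaft (ZooKeeper-less) sketch; values are illustrative.
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
log.dirs=/tmp/kraft-combined-logs
```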

Tim Berglund:
So it's here. It is so exciting to see KIP-500 merged into a formal release. Again, development only. Watch this space for more, but it's here. This is a big deal.

Kafka Connect. We have one KIP: that's 661, exposing task configurations in the Connect REST API. So here's the deal. When you submit config to a connector, you're submitting, via a POST or a PUT, a JSON document with the configuration information. All connectors are going to pass that on to their tasks, and some connectors might do some kind of computation on it first. So the config values that get passed down to the tasks might not be what you had in the JSON document. And that's fine, but you might want to look at what those config values are. I mean, it's Connect. Sometimes you want to look at these things. So now, with this KIP, you've got an API where you can explicitly get the configuration information that the task actually believes it has, rather than what you last told the connector.
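A quick sketch of what calling that looks like from Java; the worker URL and connector name are hypothetical, and the path assumes the tasks-configs endpoint that KIP-661 describes.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TaskConfigsSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical Connect worker URL and connector name.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors/my-connector/tasks-configs"))
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON here reflects the config each task actually holds,
        // which may differ from what was originally submitted to the connector.
        System.out.println(response.body());
    }
}
```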

Tim Berglund:
So, a nice improvement in management and visibility for all your Connect troubleshooting needs.

Kafka Streams. We've got a few KIPs to look at here. KIP-671: introducing the uncaught exception handler. Now, exception management in a Streams topology can be a little tricky. Remember, this code is not really running in a place where you can just go try and catch things. You're building your topology, you're passing it off to the Kafka Streams client to execute, and it does its thing. So what do you do about exceptions? Well, if there's an uncaught exception, you can now register a handler, and that's through this handy new interface called StreamsUncaughtExceptionHandler. It's got a method called handle, to which the throwable in question is passed when there's an uncaught exception. Inside that handler, you can decide what to do with that exception.

Tim Berglund:
And you can tell Streams to do one of three things. What it would have done before is kill the stream thread that's running that chunk of your Streams topology, those partitions. Now it can do one of three things. It can shut down the client; that's the default. It can replace the thread: let that thread die because there was an exception, and create a new thread to take up that work. Or it can, as Lord Denethor might say, go and die in what way seems best to you; that's shutting down the application. And maybe that's a little severe, but hey, you might want to do that. So in any case, it's just a little cleaner and more flexible way of handling uncaught exceptions.
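Here's a minimal sketch of wiring up the handler; the application id, broker address, and topic names are hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse;

public class UncaughtHandlerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "handler-sketch");     // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic"); // hypothetical topics

        KafkaStreams streams = new KafkaStreams(builder.build(), props);

        // KIP-671: decide what a stream thread's uncaught exception means for the app.
        streams.setUncaughtExceptionHandler(exception ->
                // Replace just the failed thread; the alternatives are
                // SHUTDOWN_CLIENT (the default) and SHUTDOWN_APPLICATION.
                StreamThreadExceptionResponse.REPLACE_THREAD);

        streams.start();
    }
}
```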

Tim Berglund:
And speaking of threads dying in the Streams application, now we have KIP-663, an API to start and stop stream threads. So when a Streams application starts up, it has a configurable number of threads running. Those threads can die, and they don't come back. And you might want to bring those dead threads back, or you might have an increased workload and want to take advantage of cores that you're not using; you might want more threads for any reason. So now there's an API where you can, at runtime, add threads or remove threads without having to bounce the application. Pretty handy for the manageability of the Streams app.
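And a small sketch of what those calls look like, assuming streams is a running KafkaStreams instance:

```java
import java.util.Optional;
import org.apache.kafka.streams.KafkaStreams;

public class ThreadScalingSketch {
    // KIP-663: resize a live Streams app without bouncing it.
    static void scale(KafkaStreams streams) {
        // Spin up one more stream thread (e.g., after a thread died or load grew).
        Optional<String> added = streams.addStreamThread();
        added.ifPresent(name -> System.out.println("started " + name));

        // Or shut one down if you're over-provisioned.
        Optional<String> removed = streams.removeStreamThread();
        removed.ifPresent(name -> System.out.println("stopped " + name));
    }
}
```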

Tim Berglund:
And as usual, there are more KIPs; I'm just giving you the highlights here out in the woods. You'll want to check out the blog post and read the release notes; those always have all the details. And as always, I want to know what you're building. So download this release and get started with it. Please check out the KIP-500 functionality in development; that's super exciting stuff. Let us know how it goes.

And there you have it. Hey, you know what you get for listening to the end? Some free Confluent Cloud. Use the promo code 60PDCAST, that's 60PDCAST, to get an additional $60 of free Confluent Cloud usage. Be sure to activate it by December 31st, 2021, and use it within 90 days after activation. Any unused promo value after the expiration date is forfeit, and there are a limited number of codes available, so don't miss out. Anyway, as always, I hope this podcast was useful to you. If you want to discuss it or ask a question, you can always reach out to me on Twitter @tlberglund, that's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on the YouTube video, or reach out on Community Slack or on the Community Forum. There are sign-up links for those things in the show notes if you'd like to sign up. And while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. And if you subscribe through Apple Podcasts, be sure to leave us a review there. That helps other people discover it, especially if it's a five-star review, and we think that's a good thing. So thanks for your support, and we'll see you next time.