Using Event-Driven Design with Apache Kafka Streaming Applications ft. Bobby Calderwood

Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Hi, we’re Tim Berglund, Adi Polak, and Viktor Gamov and we’re excited to bring you the Confluent Developer podcast (formerly “Streaming Audio.”) Our hand-crafted weekly episodes feature in-depth interviews with our community of software developers (actual human beings - not AI) talking about some of the most interesting challenges they’ve faced in their careers. We aim to explore the conditions that gave rise to each person’s technical hurdles, as well as how their experiences transformed their understanding and approach to building systems.

Whether you’re a seasoned open source data streaming engineer, or just someone who’s interested in learning more about Apache Kafka®, Apache Flink® and real-time data, we hope you’ll appreciate the stories, the discussion, and our effort to bring you a high-quality show worth your time.

April 21, 2022 • Confluent, original creators of Apache Kafka® • Season 1 • Episode 210

What is event modeling and how does it differ from standard data modeling?

In this episode of Streaming Audio, Bobby Calderwood, founder of Evident Systems and creator of oNote, observes that at the dawn of the computer age, because memory and computing power were expensive, people began to move away from time-and-narrative-oriented record-keeping systems (in the manner of a ship's log or a financial ledger) to systems based on aggregation. Such data-model systems, still dominant today, retain only the current state generated from their inputs, and the inputs themselves are lost. The converse of the reductive data-model system is the event-model system, which is enabled by tools like Apache Kafka® and which effectively saves every bit of activity that the system generates. The event model actually marks a return, in a sense, to the earlier, narrative-like recording methods.

To further illustrate, Bobby uses a chess example to show the distinction between the data model and the event model. In a chess context, the event modeling system would retain each move in the game from beginning to end, such that any moment in the game could be derived by replaying the sequence of moves. Conversely, chess based on the data model would save only the current state of the game, destructively mutating the data structure to reflect it.

The event model maintains an immutable log of all of a system's activity, which means that teams downstream from the transactions team have access to all of the system's data, not just the end transactions, and they can analyze the data as they wish in order to make their own conclusions. Thus there can be several read models over the same body of events. Bobby has found that non-programming stakeholding teams tend to intuitively comprehend the event model better than other data paradigms, given its natural narrative form.

Transitioning from the data model to the event model, however, can be challenging. Bobby’s oNote event modeling platform aims to help by providing a digital canvas that allows a system to be visually redesigned according to the event model. oNote generates Avro schemas based on its models, and also uses Avro to generate runtime code.

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

Kris Jenkins: (00:00)
Are event modeling and data modeling the same thing? Does the distinction even matter and why should we care? I'll be discussing all this and more with this week's guest Bobby Calderwood of Evident Systems. But before we start, let me tell you that the Streaming Audio podcast is brought to you by Confluent Developer, which is our site that teaches you everything about Kafka, from how to start it running and write your first app to architectural patterns, performance tuning, maintenance, all that good stuff. Check it out at developer.confluent.io. And if you want to take one of the hands-on courses you'll find there, you can easily get Kafka running using Confluent Cloud. Sign up with the code PODCAST100 and we'll give you an extra $100 of free credit to get you started. And with that, I'm your host Kris Jenkins. This is Streaming Audio. Let's get into it.

Kris Jenkins: (00:49)
So my guest today on Streaming Audio is Bobby Calderwood, who is the founder of Evident Systems and builder of oNote. Are you the main builder of oNote or are you one of a team?

Bobby Calderwood: (01:06)
One of a team now. Back in the old days, it was just me. But now we've got a team, a great team around me.

Kris Jenkins: (01:11)
Oh, cool. And you're also a Clojurist, I gather, which gives us something in common.

Bobby Calderwood: (01:16)
Oh, that's great! Yeah, we can chat Clojure a little bit, too. Yeah.

Kris Jenkins: (01:19)
Yeah, I might derail all the interviews for that. But oNote is kind of your central focus right now, right?

Bobby Calderwood: (01:26)
Yeah. So, oNote's our product. That's what we're a hundred percent focused on and we're excited. We actually released some features at Kafka Summit, which I'm sure we'll talk about. So we have a longstanding partnership with Confluent. We love working with you all. So...

Kris Jenkins: (01:38)
Ah, cool. Okay. It's a shame we haven't met before, but let's dive into your brain right now. So one note... Sorry, oNote. I mustn't get confused with that. oNote, it's an event modeling system, right?

Bobby Calderwood: (01:53)
That's right.

Kris Jenkins: (01:55)
You're going to have to step right back for me there because when I think of event modeling, I think, "Well, I do data modeling," which I do every time I write software. So is there a difference between data modeling and event modeling? Why do I need a tool for it? Give me the background.

Bobby Calderwood: (02:11)
Yeah, absolutely. So I've been helping teams build event-driven systems for a long time. Before I started Evident Systems, I was at Capital One building banking systems and banking systems are sort of by their nature event-driven, right? You've got transactions, which are these sort of fundamental events, and then you roll up these transactions into like account summaries. Like, okay, here's your balance. Based on the sequence of events that we've observed, here's the balance of your checking account or whatever.

Bobby Calderwood: (02:36)
So finance has this sort of naturally event-driven domain and that hasn't gone away, but for a lot of other industries, and certainly partly in finance, humans moved away from this sort of record keeping system that was mostly like time oriented, right? You think of ships' logs and financial ledgers and journals and all of these things. These are sort of fundamentally like event oriented, right? Something happened at a particular point in time, I want to make some observation about it. Scientific measurements are like that, right? I'm going to take the barometer at this time every day and record this thing. So most of human record keeping for most of history has been like that. It's only during times of extreme sort of resource scarcity where we decide, you know what? Let's throw away all the historical stuff and just capture the aggregates or just capture the summaries. And we started doing that at sort of the dawn of the computer age, when things were very expensive. Memory's expensive, CPU's expensive. So that's where I think in our mindset, we shifted away from thinking about time as sort of... yeah, and the narrative, the story as the fundamental construct of our records and we started thinking about the data model. We started thinking about the snapshot, the current state as the primary thing.

Bobby Calderwood: (03:46)
And so we've got great tools and great practice and great sort of engineering built around data modeling and that doesn't go away with event modeling. We certainly still do that. Event modeling is sort of re-enshrining the narrative, the business narrative as the key aspect of your system. So now you're going to start thinking first in terms of what happens. What are the key things in the course of serving my customer that occur, right? They tell us some stuff, we make some decisions internally, we provide some service to them, and we're just going to like record that. We're going to record the narrative of what our customer asked us for, what we choose to do about it, how we resolve that request from our customer. Just recording those key business outcomes is sort of the heart of event modeling.

Kris Jenkins: (04:34)
Just the facts, man.

Bobby Calderwood: (04:35)
Yeah. Just the facts, exactly. And then we can derive. From those facts, we can derive what the current state of the world is at any point in time. So like if you think about a chess game, data modeling is how do you model the board as a data structure. And as I move pieces around, how do I destructively mutate that data structure to reflect the current state of the system?

Bobby Calderwood: (04:56)
Event modeling is all about, hey, let's just capture the moves, right? Just capture the moves in chess notation. Pawn to... I'm not very good at chess.

Kris Jenkins: (05:04)
Queen to pawn three. Whatever, right, right, right.

Bobby Calderwood: (05:05)
Yeah. Pawn three, whatever, exactly. So let's just capture those moves. With computers, that's really easy to derive the board state from the moves, right? Because we know what the initial state is, we know what the sequence of transitions in this sort of finite state machine is. So we can easily compute the current board state, but it changes the modeling on its head a little bit, right? We still need to model the board, we still need to know what data structure we're going to use to sort of efficiently model the board. That's important, but that's not the primary concern anymore in our event modeling perspective of the system.
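
The chess example can be sketched in a few lines of Python. This is a minimal illustration rather than anything from oNote: the three-piece board, the `(source, target)` move format, and the function names are all assumptions, and no chess rules are enforced. The point is only that the move list is the record and the board is derived from it.

```python
from functools import reduce

# Hypothetical, heavily simplified board: square name -> piece name.
INITIAL_BOARD = {"e2": "white pawn", "e7": "black pawn", "g1": "white knight"}

# The event log: every move in order, never edited after the fact.
MOVES = [("e2", "e4"), ("e7", "e5"), ("g1", "f3")]


def apply_move(board, move):
    """Return a new board with one move applied; the input board is untouched."""
    source, target = move
    new_board = dict(board)
    new_board[target] = new_board.pop(source)
    return new_board


def board_after(moves, count=None):
    """Derive the board after the first `count` moves (all moves by default)."""
    selected = moves if count is None else moves[:count]
    return reduce(apply_move, selected, INITIAL_BOARD)


current = board_after(MOVES)              # state after the whole game so far
after_first_move = board_after(MOVES, 1)  # any earlier moment is just a shorter replay
```

Because nothing is mutated in place, any earlier position is still recoverable by replaying a prefix of the log.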

Bobby Calderwood: (05:35)
Now the primary concern is how do we model these events? How do we enable the user to introduce change into that system? Like how do we allow them to move a piece? How do we derive the board state from the sequence of moves, right? So you have commands. Commands, which are usually blue sticky notes in event modeling. Commands are how the user introduces change in the system. "Hey, I want to move this piece." Event is the recording of that change. We can say, "Hey, sorry. That's not a legal move." And that command never results in an event, right? So we just say, "Hey, sorry."

Kris Jenkins: (06:06)
Okay. Right.

Bobby Calderwood: (06:08)
So the command is the user's request for change. The event is the record of that change itself. So events are usually orange sticky notes. Green sticky notes represent read models, which are aggregations of state, right? This is how we're kind of summarizing what's happened in the story so that we can present to the user and inform them about what's going on. And then in event modeling, we also model the user's experience. So we model actually what the user sees. The screens and the forms and all that stuff that the user sees in the user interface. We model that as well. So it's a very simple modeling discipline. There's always sort of four elements to this event modeling. But with it, you sort of capture the entire interaction of this information system and you capture all the important bits.
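
The split between commands, events, and read models described here might look roughly like the following Python sketch. The class names, the `handle` function, and the legality rule (source square occupied, target square free) are hypothetical placeholders, not real chess rules and not oNote's generated code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MovePiece:
    """Command: the user's request for change."""
    source: str
    target: str


@dataclass(frozen=True)
class PieceMoved:
    """Event: the record of a change that actually happened."""
    source: str
    target: str


def occupied(initial_squares, log):
    """Read model: the set of occupied squares, derived from the event log."""
    squares = set(initial_squares)
    for event in log:
        squares.discard(event.source)
        squares.add(event.target)
    return squares


def handle(command, occupied_squares, log):
    """Append an event if the command is accepted; return None if it is rejected."""
    # Placeholder legality check: assumed for illustration only.
    if command.source not in occupied_squares or command.target in occupied_squares:
        return None  # "Hey, sorry": a rejected command produces no event
    event = PieceMoved(command.source, command.target)
    log.append(event)
    return event


initial = {"e2", "e7"}
log = []
accepted = handle(MovePiece("e2", "e4"), occupied(initial, log), log)
rejected = handle(MovePiece("a1", "a2"), occupied(initial, log), log)
```

Only the accepted command leaves a trace in the log; the read model is recomputed from whatever the log contains.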

Kris Jenkins: (06:52)
Right. You're mentioning sticky notes. So this is something you would generally do on a whiteboard, in a team, physically?

Bobby Calderwood: (07:00)
Yeah. So that's historically how it's been done is on a whiteboard or a big piece of butcher paper or whatever on the wall. I've done a lot of physical event modeling, meatspace event modeling with sticky notes and Sharpies and that whole thing. And that's great and that was sort of how it was originally envisioned. I didn't invent this. This was invented by Adam Dymitruk as a modeling discipline. It shares some sort of philosophical lineage from event storming and domain-driven design and some of those other modeling disciplines that are out there, but event modeling is sort of the best thing I've found for designing event-driven systems and envisioning our software and information systems as stories, as narratives.

Bobby Calderwood: (07:43)
So yeah, we did a bunch of that stuff with sticky notes and all that back in the good old days. We found that the artifact left a lot to be desired. The process was really great because you're sort of sharing, learning. You've got your business stakeholders in the room and your technical stakeholders, you've got your graphic designers and UX people. So everyone's in the room and everyone's learning a lot about how we serve our customers in the business process and all of that. But then at the end of this exercise, usually it's like a one day exercise, you're left with this giant piece of butcher paper covered in sticky notes. And now you've got to like try to move that into the developer's space and all the sticky notes fall off.

Bobby Calderwood: (08:21)
But the artifact leaves a lot to be desired, and what we found, and this is the reason we built oNote, was, man, it would be great if we had a digital canvas for supporting event modeling. Something that was structured so that you get feedback. If you're new to event modeling and you just have kind of a whiteboarding tool, like Miro or one of the really great collaborative whiteboarding tools, you don't get any feedback on "am I doing this right?" because it's sort of an empty canvas, you have infinite degrees of freedom.

Bobby Calderwood: (08:47)
So we wanted to make a constrained sort of purpose built canvas for doing event modeling and then, we wanted to, behind-the-scenes, have a data structure because the event model really represents your intention for the system that you're going to build. So from this data structure, we can start generating code, we can output project management artifacts. Like we can just slice this diagram up into the constituent developer tasks and now you don't have to have a separate sort of user story generation ritual, where we all sat around JIRA. We can just sort of mechanically derive like here's the set of work we need to do because here's the system we're building. And we can now mechanically derive the tasks from the model. We can integrate with running systems and actually see like, okay, in production, what are we seeing? How many checkout events per minute are we getting? And we can sort of start to use the blueprint of the system which we've built with event modeling and use it not just during design, but throughout implementation with code generation and project management and into operations, where we're actually envisioning real data as it's happening.

Kris Jenkins: (09:49)
Okay. Because I'm always wearing my programmer's hat and that makes me want to get into the nitty gritty of how it affects programmers. But I'll try and be patient. Step back, right? Because there are a lot of different people at this whiteboard and take me through how the... Because I can see how a programmer thinks about data and events and modeling, but what about the product manager coming to this system? How do they play out?

Bobby Calderwood: (10:15)
Yeah. So that's been the most interesting thing, sort of the anthropology behind doing these event modeling events has been-

Kris Jenkins: (10:23)
[inaudible 00:10:23].

Bobby Calderwood: (10:23)
Yeah, I know it is! It's really interesting because for the most part, product people and business people still think of their processes as stories, right? Their brains haven't been corrupted by the computers like the programmers have, right? So we programmers have sort of intentionally like sort of damaged our own brains to think in terms of current state, and data model, and third normal form, and database tables, and all that kind of stuff. For the most part, product people and business stakeholders are relatively free of that. They still think in terms of, "Okay, this happens, and then this happens. And if this happens, then we do this. But if this other thing happens, then we do this." They're still thinking in terms of the story and the narrative and sort of the causal flow. Which is, frankly, sort of the natural way of thinking about things, right? That's the natural narrative oriented way that the human brain is designed to work.

Bobby Calderwood: (11:13)
It's really at that intersection between product stakeholders and technical stakeholders where we have this sort of impedance mismatch where we're like, "Okay, so what does that mean about this field and this table and this database? And is this an integer field or is this a string field?" It's at that level that the conversation currently sort of breaks down and you're kind of like, "Well, now we have to think about things." And some of that's leaked upstream to product. Product people start thinking in terms of databases, too, because we've trained them to do that as programmers. But we find that this event modeling actually facilitates a much more natural conversation with our product stakeholders, right? Where we can talk about cause and effect, we can talk about timeline, we can talk about alternative cases. Okay, well, if this happens, now we actually have a different flow that we should model down here on the board somewhere later. So yeah, it ends up being a very productive conversation. In my experience, product people who do event modeling just love it. Like completely love what it makes possible.

Bobby Calderwood: (12:10)
The other thing that's really interesting for product people is that app and service developers, who tend to be sort of kind of laser focused or maybe myopically focused on the transaction processing concerns, forget that there's this whole world like downstream of transaction processing, where you're thinking about analytics, you're thinking about audit, you're thinking about kind of all the crosscutting concerns, right? Somebody somewhere wants access to this data, right? This data that the transaction processing teams are capturing kind of at the edge represents ground truth, right? For a lot of the analytics and stuff that we're trying to do behind-the-scenes, this represents ground truth. Currently, the transaction processing teams are letting everybody else down, right? Because the transaction teams are like losing a bunch of the narrative and they're only writing down certain aspects of it. And they're writing it down in this highly opinionated kind of third normal form, transaction processing centric kind of way.

Kris Jenkins: (13:05)
Yeah, yeah. We kind of get this stream of fact coming in and we go, "Oh, okay, here's a fact. Let me draw my conclusion from it and throw the fact away."

Bobby Calderwood: (13:13)
That's exactly what happens! Not realizing that like 20 people behind me want that fact. Like they want to know when a user does X so that we can make sure we're doing it legally and in compliance with laws and regulations for audit. For marketing, they certainly want to know what's happening because we want to understand our customer's behavior so we can serve them better and make new products for them and all of those sorts of things. So there's lots of people behind the transaction processing team who want these data, but the transaction processing team's like, "No. We're just going to write down our conclusions and then you can ETL out of our database once a week and put it in the data lake. And that's how you're going to run your analytics." And all these people behind-the-scenes are kind of like, "Oh, that's not fair. Like you get first class access to this data and we get second class access to it. And it's got your fingerprints all over it. By the time we see it, it's got your opinions kind of all embedded in it," right?

Bobby Calderwood: (13:59)
So that's a bad state of things, right? You've got products like Kafka, you've got kind of the revolution that Confluent and Kafka are offering to the back office folks with all these analytics, and data in motion, and streaming, and all this stuff. None of that's going to matter if you're getting bad data at the edge, right? If you're getting stuff that has the transaction processing concerns opinions written all over it and some of it we've lost, that degrades the whole experience for everyone downstream. The data that is in motion behind-the-scenes isn't going to be as high quality as if the transaction processing teams got on the same page and started thinking about things event first, started thinking about data in motion. Right from the edge, right where we observe reality about what's going on outside our organization, right where we're capturing our customer's intentions kind of at that transaction processing layer.

Bobby Calderwood: (14:51)
That represents this sort of cell membrane, right? If everything back in the organization is the nucleus and all the important stuff's going on inside the company, that cell membrane represents our interaction with the outside world and that's the transaction processing stuff. And so event modeling's really an effort to get the transaction processing folks, the app and service devs to start building apps and services event first. Start conceiving of these things and writing down the stream of events about reality that they're observing. And then people downstream have the choice to either accept their read models, which are sort of their summaries of state. Or to be like, "I don't care about your read model. I just want to look at the facts and I'm going to come to my own conclusions about that same set of facts that you observed, transaction processing team, and we'll go from there."

Kris Jenkins: (15:44)
Yeah. Do you know what it really reminds me of? Years ago I was a Java programmer and I started learning Clojure. And the thing I found hardest was not like the weird syntax or the different libraries. The thing that I really struggled with was moving from thinking about objects, which are just these records of state, to just data and transforms over them. And that mindset is really hard to shift from.

Bobby Calderwood: (16:09)
Yeah, yeah, yeah. Absolutely. I wrote an article about this some time ago, I think on the Confluent blog actually. But there's sort of that object thinking and that sort of current state oriented thinking really is pervading kind of the microservices that are proliferating now. Everyone's trying to build these like tiny objects that sort of communicate synchronously with each other and try to convey state to each other. If you look at things through more of a functional programming lens, you can just say, "Hey, no." We have this immutable log of facts that we're observing. We can write a transducer over those facts and aggregate those things up into some aggregate or some summary and say, "I've drawn this conclusion from this set of facts. I haven't changed or mutated or altered or lost any of those facts. You can still go and observe those facts yourself. And here's the conclusion that I've reached." So that's exactly events and read models.

Bobby Calderwood: (17:05)
And so the great thing about our modern... Martin Kleppmann wrote a great paper about this, too, this sort of online event processing idea. That's exactly what we're advocating here is at the transaction processing level, instead of getting requests from the outside, thinking about it really hard, throwing away the request and writing down our conclusion, let's record that conversation we're having. Let's record the facts that we're observing about reality first, then derive our own sort of transaction processing lens on what the current state of the world is so that we can continue to do our important transaction processing work, right? A lot of times you need a read model in order to assert invariants about uniqueness or whatever. Like we don't want to have 10 users with the same email address, so we need to capture sort of an aggregate of state at the edge to make sure that we're asserting those invariants.
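
As a rough Python sketch of that idea, the email-uniqueness invariant can be checked against a read model that is itself derived from the recorded events. The event type names and dictionary fields below are assumptions made for illustration. Rebuilding the set on every call is a simplification; a real system would keep the read model updated incrementally.

```python
def registered_emails(events):
    """Read model: the set of email addresses that have been registered."""
    return {e["email"] for e in events if e["type"] == "UserRegistered"}


def register(events, email):
    """Record the request first, then record the outcome; return True on success."""
    normalized = email.strip().lower()
    events.append({"type": "RegistrationRequested", "email": normalized})
    if normalized in registered_emails(events):
        # The invariant is asserted against the read model, and the rejection
        # is recorded too, so the fact is not lost to downstream readers.
        events.append({"type": "RegistrationRejected", "email": normalized,
                       "reason": "duplicate email"})
        return False
    events.append({"type": "UserRegistered", "email": normalized})
    return True


events = []
first = register(events, "a@example.com")
second = register(events, " A@Example.com ")  # same address after normalizing
```

The transaction processing code still reaches its own conclusion, but the request and the outcome both remain in the log for anyone else to interpret.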

Bobby Calderwood: (17:58)
But those are not the only conclusions that our organization wants to draw from those facts and so sort of transparently writing those facts down, shipping them back to our back office teams on something like a Kafka topic, now we've sort of unlocked... We're doing our important work at the edge still. We're still coming to our own conclusions and so forth, but we're allowing everyone else downstream first class access to the same data that we used to draw those conclusions. So now everyone's kind of singing from the same sheet of music. You're going to have much better data quality downstream.

Bobby Calderwood: (18:32)
Furthermore, you can leverage the fact that there isn't just one database anymore, right? Now we have a database for metadata and kind of the transaction processing concerns, but we may have a full text index on that same data stream. When you think about your systems event first, even at the edge just at the transaction processing level, now we can start imagining several different read models over the same set of facts, right? Several different sort of aggregations over the same set of events that we've observed for different purposes. So we can build a full text index in Lucene or Elasticsearch or whatever for the full text bits. And we can have our kind of read models that we serve out to our customers to inform them about what the system's doing, what the current state of the system is, and we can have that in whatever, Redis, or some RDBMS or whatever, Postgres. But then we can also have maybe a blob store for the big images that our customers send. So there's not the database anymore. Now there's sort of the databases and we can put everything in its right place because we're just building these little transducers over these streams of events in order to-

Kris Jenkins: (19:39)
For people that aren't Clojure people, a transducer is like a reduce function or a fold or a-

Bobby Calderwood: (19:44)
That's right.

Kris Jenkins: (19:44)
... Kafka Streams rollup. Right.

Bobby Calderwood: (19:46)
That's right. Yeah. So in the Kafka Streams world, it's aggregate, right? You have aggregate in the high-level API. But yeah, it's exactly that. It's just a function that takes in many things and then returns one thing. So that's how you aggregate up state. Some [crosstalk 00:20:02]

Kris Jenkins: (20:02)
And in our streaming world, it may be more than one time, right?

Bobby Calderwood: (20:05)
Yeah, yeah. Exactly.

Kris Jenkins: (20:06)
More things come in and a new one thing comes out.

Bobby Calderwood: (20:09)
Yeah, that's right. Or you can envision the sequence of many things as being unbounded, where many things can arrive at different points in time. As each thing arrives at the door, we're going to incorporate that new thing into our single thing and now we have a new observation about the sum of some stream of numbers or something, right? Each number that arrives affects the sum and we can actually compute the sum as of the point in time where each of these individual numbers arrives to be summed.
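
The running sum described here has a direct plain-Python analogue in `itertools.accumulate`, which lazily emits the aggregate as of each arrival. This is only a sketch of the idea and not the Kafka Streams API; the `arrivals` generator is a stand-in for an unbounded source.

```python
from itertools import accumulate


def running_sums(numbers):
    """Lazily yield the sum as of each arrival; also works on unbounded iterators."""
    return accumulate(numbers)


def arrivals():
    """Stand-in for a stream: numbers showing up one at a time."""
    yield from [5, 3, 10]


# Each new number produces a new observation of the sum.
sums = list(running_sums(arrivals()))
```

Because `accumulate` is lazy, nothing requires the input to end; each value is folded into the aggregate as it arrives.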

Kris Jenkins: (20:41)
Yeah. So you've got different ways of looking at the same stream of recorded data and that kind of naturally maps. The thing I always think about is like if you're a startup, you're really worried about registrations. And you've got people registering on the site and someone sends 20 failed registration requests that you reject, and then the 21st one goes through and you record one fact that they registered. Whereas if you actually just recorded the fact that they tried 21 times to get the job done, the UX team would realize you've got a serious problem, right?
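
A small Python sketch makes the registration example concrete. The event names, the `visitor` field, and the sample log are invented for illustration; the point is that the same log answers both the transaction question (who registered?) and the UX question (how many tries did it take?).

```python
from collections import Counter

# Hypothetical log in which rejected attempts are recorded, not discarded.
LOG = [
    {"type": "RegistrationRejected", "visitor": "v1"},
    {"type": "RegistrationRejected", "visitor": "v1"},
    {"type": "UserRegistered", "visitor": "v1"},
    {"type": "UserRegistered", "visitor": "v2"},
]


def registered_visitors(log):
    """Read model for transaction processing: who ended up registered."""
    return {e["visitor"] for e in log if e["type"] == "UserRegistered"}


def attempts_per_visitor(log):
    """Read model for the UX team: attempts made, including rejected ones."""
    attempt_types = {"RegistrationRejected", "UserRegistered"}
    return Counter(e["visitor"] for e in log if e["type"] in attempt_types)


registered = registered_visitors(LOG)
attempts = attempts_per_visitor(LOG)
```

If only the final `UserRegistered` fact had been stored, the second read model could not be built at all.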

Bobby Calderwood: (21:16)
Yeah. That's exactly right. So the sort of separation or bifurcation of transaction processing concerns from analytic concerns to answer precisely those kinds of questions. That bifurcation starts to go away, right? You can go back to the origin of this idea of OLTP versus OLAP and a lot of that was just driven by the limitations of database systems at the time, right? We're like, "Okay. Well, we can't do long running analytics queries on our database because that'll tie up new registrations from coming in the door." So now, we bifurcate those concerns and we sort of snapshot that database off to this special analytics space. And now not only do we have two operational systems that we have to maintain, now we have two completely different career tracks. We've got software engineers and we have analysts, or data scientists, or whatever. And now like that bifurcated back in the '90s based on sort of the limitations of databases in the day. And now, there's completely different departments that manage those things. I mean, the distinction is totally artificial, right?

Kris Jenkins: (22:12)
Yeah. I mean, it's Conway's Law working in both directions, right?

Bobby Calderwood: (22:16)
Yeah, yeah! That's exactly Conway's Law. And you're sort of like we made that decision at some point in time and now like we've poured concrete all over it, now we've got different recruiters and different job titles for these different... But we're really just observing facts, recording them in software systems, and then making conclusions based on those facts, right? We're all doing the same types of work, but now we have these two different career tracks for doing that. When you start thinking about your systems event first, you can just build different summaries of the same set of facts that you're recording for different data access patterns, different purposes.

Bobby Calderwood: (22:47)
And so spinning all the way back to our product owners, product owners love this because when you're a startup, you're just sort of laser focused on the initial user experience. Say I'm making a coffee shop. I'm really laser focused in on my checkout process for how we take orders because that's what's going to keep my startup afloat and get the next round of funding and all that stuff, right? So laser focused on that. But behind-the-scenes, we've got all these other concerns. You've got the barista's fulfillment app. On the flip side, away from the customer, you've got baristas back there making drinks and they need to be able to see the same set of events, but summarized slightly differently so that they can fulfill the drink orders. And then behind that, you've got managers of each individual coffee shop location who care passionately about the inventory. And each time a barista makes a drink, it draws down on the inventory of milk, and coffee beans, and whatever else. And those inventory levels are something that I care about because the supply chain's really tough right now and I need to make sure I order in time to get coffee in from wherever.

Bobby Calderwood: (23:55)
And so that same set of transaction processing concerns bleeds back into the back office operations of your company and we really care about these things. If we were responsible as transaction processing devs, we would take that into account. We would write down the facts that we're observing and then allow others behind-the-scenes, like our poor manager, to take the set of facts from both the checkout app and the barista's app and say, "Great. This is what our inventory levels are doing based on those set of facts that you're recording, observing about reality recording, and then conveying back to me so that I can make those set of conclusions that I need to come to as the manager." Which are different and probably not even envisioned by the set of concerns that the transaction processing devs cared about initially, but because they did things the right way and wrote down things and envisioned their systems in terms of business events, you can come to all these different conclusions.
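
The coffee shop scenario can be sketched as two read models over one shared event log. Everything named here is hypothetical: the event types, the recipe quantities, and the starting stock are made-up values chosen only to show the shape of the idea.

```python
# Hypothetical ingredient usage per drink.
RECIPES = {
    "latte": {"milk_ml": 200, "beans_g": 18},
    "espresso": {"beans_g": 18},
}

# One shared log fed by both the checkout app and the barista's app.
EVENTS = [
    {"type": "OrderPlaced", "order": 1, "drink": "latte"},
    {"type": "OrderPlaced", "order": 2, "drink": "espresso"},
    {"type": "DrinkMade", "order": 1, "drink": "latte"},
]


def barista_queue(events):
    """Read model for baristas: orders placed but not yet made, in arrival order."""
    made = {e["order"] for e in events if e["type"] == "DrinkMade"}
    return [e["order"] for e in events
            if e["type"] == "OrderPlaced" and e["order"] not in made]


def inventory(events, starting_stock):
    """Read model for the manager: stock left after every drink made so far."""
    stock = dict(starting_stock)
    for e in events:
        if e["type"] == "DrinkMade":
            for ingredient, amount in RECIPES[e["drink"]].items():
                stock[ingredient] -= amount
    return stock


queue = barista_queue(EVENTS)
stock = inventory(EVENTS, {"milk_ml": 1000, "beans_g": 500})
```

Neither the checkout code nor the barista code had to anticipate the inventory question; the manager's view is just another summary of the same facts.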

Bobby Calderwood: (24:48)
So that allows product owners to say, "Hey, we have a new requirement. We just realized that we need to know X." And now X is a really easy question to answer. If you've done sort of event modeling and built your system event first, it's really easy to add that next feature. Perhaps it's a linear cost to add each additional feature, instead of the sort of like exponential cost that we see in most systems where the hundredth feature is 50 times more expensive than the second feature. We get the sort of exponential complexity growth in most systems. When you start thinking about your systems event first, and recording like the language of the system like [inaudible 00:25:23] talks about, and thinking about immutable data, and thinking about recording facts, now adding that hundredth feature's maybe no more costly than adding the second feature, right? You've got the sort of linear cost growth because you've got the right sort of primitives and abstractions in place to extend your systems over time.
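
To make the coffee shop example concrete, here is a minimal sketch in Python of several read models derived from one shared list of events. The event types, the recipe quantities, and the function names are all hypothetical illustrations; nothing here comes from oNote or from a Kafka API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    order_id: int
    drink: str


@dataclass(frozen=True)
class DrinkMade:
    order_id: int
    drink: str


# Hypothetical recipe table: how much of each ingredient one drink consumes.
RECIPES = {
    "latte": {"milk_ml": 200, "beans_g": 18},
    "espresso": {"beans_g": 18},
}


def barista_queue(events):
    """Read model for baristas: orders placed but not yet made."""
    pending = {}
    for e in events:
        if isinstance(e, OrderPlaced):
            pending[e.order_id] = e.drink
        elif isinstance(e, DrinkMade):
            pending.pop(e.order_id, None)
    return pending


def inventory_used(events):
    """Read model for managers: ingredients drawn down by drinks made."""
    used = Counter()
    for e in events:
        if isinstance(e, DrinkMade):
            used.update(RECIPES[e.drink])
    return dict(used)


def drinks_sold(events):
    """A question asked later: answered from the same facts, writers unchanged."""
    return dict(Counter(e.drink for e in events if isinstance(e, OrderPlaced)))


# The recorded facts, in order. Each read model is just a different fold.
EVENTS = [
    OrderPlaced(1, "latte"),
    OrderPlaced(2, "espresso"),
    DrinkMade(1, "latte"),
    OrderPlaced(3, "latte"),
]
```

Under these assumptions, adding `drinks_sold` required no change to the code that records events, which is the roughly linear cost of a new feature described above.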

Kris Jenkins: (25:40)
If I can risk starting a flame war on the internet, which is desperately easy to do.

Bobby Calderwood: (25:45)
Oh, absolutely.

Kris Jenkins: (25:46)
One of my slight bugbears of test-driven development is at its worst, it can be, "What conclusion am I trying to get to," and only build enough system to get that one conclusion.

Bobby Calderwood: (25:58)
That's right.

Kris Jenkins: (25:58)
And you just throw away so much leading data that one day you're going to turn out to want and you hit that exact situation where the hundredth feature becomes nigh on impossible.

Bobby Calderwood: (26:07)
Yes. Yeah. Test-driven development, I think, is symptomatic of a lot of the sort of big A, agile kind of way of thinking that took over. I think we had a bit of a pendulum swing from, "Let's do big design up front and we're not going to write a single line of code until we've hammered out every aspect of this design." And then, obviously that doesn't work because you learn things in the course of interacting with your system and your customers interacting with your system that you want to incorporate in a sort of iterative way. So the big design up front, which ends up being a straw man because most of those ideologies didn't have single big bang design things, right? But that swung all the way over to like no design up front where it's like, "We're just going to start writing code. We're just going to start writing tests, and getting those tests to go green, and deploy software out in front of our users."

Bobby Calderwood: (26:55)
I think sort of the right place is somewhere in the middle where you sort of do just enough design up front and your design tool, oNote and event modeling in my view, is sufficiently flexible and sufficiently integrated in with your development processes that it's easy to change something. Where you go in and say, "Okay, we learned something. Before we change the code, we're going to go back and we're going to change the design. And then, we're going to re-derive some aspects of the code from the design. And then, we're going to implement the feature from there."

Bobby Calderwood: (27:21)
So now our design continues to reflect the reality of the running system. But we've taken the time to think through if we change this, who does this impact downstream? Oh, maybe this impacts anyone who's reading this topic. Maybe there's a new message type that they have to deal with and if we just change this willy-nilly, we're going to break a hundred people in our organization, right? Let's not do that. Let's go through the process of designing first. Even if it's fast design, it's iterative, it's happening over the course of days and you're going to write the code and deploy it next week, you still need to pause and do that design so that you don't sort of paint yourself into a corner, like you often do with sort of these no design or emergent design systems. People talk about, "Well, the design's going to be an emergent property of our running code." And it's kind of like that's great, except mostly it doesn't. Mostly no design ever emerges and it's just like a pile of spaghetti and now it's really hard to integrate.

Kris Jenkins: (28:13)
Yeah. And you could've got the feedback faster and easier just by looking at the domain you're modeling first and just thinking it, right?

Bobby Calderwood: (28:20)
Yeah.

Kris Jenkins: (28:21)
You can do a lot of iterative design just by discussion and pen and paper, right?

Bobby Calderwood: (28:25)
Yeah, yeah, exactly. And conferring with your subject matter experts and even showing that... One of the things that I love about event modeling is that it is so visual. You've got the screens kind of right on the model. We have a Figma integration in oNote that allows you to surface your Figma designs. The stuff that the UX folks are working on diligently over in Figma, you can surface those on the timeline in oNote and see, "Oh, this is when the user interacts with this particular form or whatever." So you can actually show users, you can do some user level testing just with your event model. Say, "Hey, you suddenly find yourself on this screen. This is kind of where you are at in the process. Do you know what to do? Or click on the button that you would push now to take the next step, right?" So you can do that level of user testing just with your design artifacts without having to go all the way through to production code. Yeah. So there is a lot of design and design feedback that you can do before you commit to running code or before you commit to an architectural decision that is maybe ill advised or whatever. Yeah.

Kris Jenkins: (29:27)
Yeah. Or before you spend three weeks figuring out what you could have spent figuring out in an hour, right?

Bobby Calderwood: (29:32)
That's right, that's right.

Kris Jenkins: (29:33)
Sometimes.

Bobby Calderwood: (29:34)
Yeah.

Kris Jenkins: (29:35)
What's that phrase? A few weeks of coding can save an hour of thinking.

Bobby Calderwood: (29:39)
That's right, that's right. I saw Vaughn Vernon post that recently.

Kris Jenkins: (29:43)
Oh, yeah. It's a classic.

Bobby Calderwood: (29:43)
He's a big domain driven design guy. Yeah.

Kris Jenkins: (29:45)
So how did you get into all this? Give me your background on this.

Bobby Calderwood: (29:49)
Yeah, yeah. So as I mentioned, I was at Capital One before I started this company. We were doing a ton of this sort of thing at Capital One with Kafka and [crosstalk 00:29:58]

Kris Jenkins: (29:57)
They're a big investment bank, right?

Bobby Calderwood: (30:00)
Big retail bank. So it's actually retail customer facing. They actually don't do any of the Wall Street stuff, which is sort of interesting. They're just credit cards, bank accounts, auto loans, that sort of thing. Great company, I loved working for Capital One. They were spectacular. Before that, I was working with Rich, and Stu, and the team at Cognitect, that is now Nubank.

Kris Jenkins: (30:21)
Oh, the Clojure company!

Bobby Calderwood: (30:22)
Yeah. The Clojure folks, yep. And I was working on Datomic and kind of the customer success side of Datomic, and writing sample apps, and helping customers sort of be successful with Datomic. And that was the first system that really got me thinking about this, making the jump over into functional programming as you described in your own experience, and then thinking about data with this sort of time component as sort of a first class construct. Datomic's all about being able to see your database as of any particular transaction, right? So there's this real firm notion of a logical clock in your system and you can see what the state of the world was at a particular point in time.

Bobby Calderwood: (31:01)
The architecture of the system was this really great sort of CQRS event source system and it was really cool. And that architecture allowed us to do a lot of really neat operational things like cache queries within my Java process, right? Within my application, we had a query cache where we could actually serve a lot of query results from cache. Instead of going round trip across the network to ask for a query and maybe there's a cache behind the database someplace, we actually had caches in memory because it was all immutable data, and it was easy to cache, and easy to know when it's invalidated, and all that stuff. So really, really cool system. That was one of the great honors of my career was to work with Rich and Stu on Datomic.

Bobby Calderwood: (31:45)
And that got me thinking and sort of noodling about this in the data domain. This is like thinking about read models, in effect. I didn't come to that term until later, but we're thinking about read models within Datomic. What if we sort of backed out one step and thought about our entire system this way? What if we thought about the business events that lead to these data consequences that Datomic takes care of? What if we think about the business events in the same way? Or we're recording the business events in response to user commands, we record these events, and then we aggregate state up into a read model, into something like Datomic perhaps, to keep track of what the current state of the world is.

Bobby Calderwood: (32:24)
Anyway, that sent me down this sort of track about immutable data and facts and streaming and all that stuff. And it's come to mean a lot to me because I think this is the right way to build systems, right? And having done this the wrong way so many times in my career and gone down these horrible, wrong trails of trying to solve this problem, microservices that are sort of object oriented and chatting synchronously and all that stuff. You can't really reason about state in that kind of a system, right? When you have this distributed system where everything's all the time synchronous right now and we're just going to chat synchronously over HTTP or whatever, you can't reason about the state of the system in that world, right? Because there's no fixed point in time, there's no fixed point in that system. Whereas if you envision-

Kris Jenkins: (33:05)
Yeah. It becomes like a series of islands arguing with each other, right?

Bobby Calderwood: (33:08)
Yeah! Yeah, yeah. Exactly. Exactly. And so you have these horrible like Death Star diagrams of system dependencies where it's just this chaotic mess of entanglements and dependencies. And when one thing goes down, it's a complex system in sort of the technical sense where you can't know what's going to happen, right? You get an unexpected input and like you can't know what the system's going to do in response to that input because it's a complex system.

Bobby Calderwood: (33:30)
When you simplify that down and you say, "No, we're going to talk about immutable data. We're going to talk about streams of events. We're going to divide up responsibility for keeping track of these things according to sort of these bounded contexts." Now you can start to reason about your system. At least within a bounded context, you have a logical clock of this event happens and this event happens and they each have a number. And so you can say, "Okay, as of this point in time, what did the state of the system look like?" And you can reconstruct the state of the world at that point in time. So it's a very important thing in terms of managing distributed systems. It's a very important sort of way of thinking in terms of serving customers because customers care about time, right?
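
The idea of numbered events acting as a logical clock can be sketched in a few lines of Python. This is an illustrative sketch only: the `Event` shape, the account example, and `balance_as_of` are assumptions made for the example, and it assumes the log is already ordered by sequence number within one bounded context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int     # position in the log; this is the logical clock
    kind: str    # "deposited" or "withdrawn" (hypothetical event kinds)
    amount: int  # cents


def balance_as_of(log, seq):
    """Rebuild the account balance as it stood after event number `seq`.

    Assumes `log` is sorted by `seq`. Events are never modified, so the
    answer for a given `seq` is the same no matter when it is asked.
    """
    balance = 0
    for e in log:
        if e.seq > seq:
            break
        balance += e.amount if e.kind == "deposited" else -e.amount
    return balance


LOG = [
    Event(1, "deposited", 10_000),
    Event(2, "withdrawn", 2_500),
    Event(3, "deposited", 500),
]
```

Calling `balance_as_of(LOG, 2)` replays only the first two events, so it answers "what did the state look like at that point?" without needing any snapshot taken at the time.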

Bobby Calderwood: (34:12)
When I was at Capital One, we had sort of this requirement for serving military members, right? So military person gets orders, they ship overseas to serve in the military. Some adverse financial thing happens in their bank account while they're away and hopefully focused 100% on their mission. They come back and they're like, "Hey, you charged me a bunch of overdraft fees," or whatever. There's legislation in the United States that says no. As long as they can show you orders for that time period, you have to waive all of those fees and compensate them and all that stuff.

Bobby Calderwood: (34:44)
This was very difficult to do because we weren't thinking about it at the time in terms of time and a logical clock. The only way to solve that problem is to like assign a team of human beings to go in and try to figure out what happened and make changes.

Kris Jenkins: (34:58)
Which is very, very messy.

Bobby Calderwood: (34:59)
Yeah. And expensive and so forth, right? But if you had a system that said, "Okay, let's go back to this point in time, let's look at the state of the world. Now let's imagine an alternative future." Like here's the future from that point that we recorded, here's what happened. Let's imagine to ourselves an alternative future where this person had military orders on record at that point and now let's just run that future forward. Okay, great. Let's compare the imagined future to what we actually did, come up with the difference, cut the check to our customer and move on with our lives, right? It becomes a lot easier to sort of automate that thinking when you have the business event as sort of the fundamental construct of your system.
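
The "imagine an alternative future and compare" idea can be sketched as two replays of the same recorded transactions. Everything in this Python sketch is hypothetical: the fee amount, the fee rule, the `Txn` shape, and the function names are illustrative assumptions, not how any bank computes fees.

```python
from dataclasses import dataclass

OVERDRAFT_FEE = 3_500  # cents; a made-up fee for illustration


@dataclass(frozen=True)
class Txn:
    day: int
    amount: int  # cents; positive is a deposit, negative a withdrawal


def replay(txns, fee_exempt_days=frozenset()):
    """Fold over the recorded transactions; return (final_balance, fees_charged).

    Assumed rule: a withdrawal that leaves the balance negative incurs a
    fee, unless it falls on a fee-exempt day.
    """
    balance, fees = 0, 0
    for t in txns:
        balance += t.amount
        if t.amount < 0 and balance < 0 and t.day not in fee_exempt_days:
            balance -= OVERDRAFT_FEE
            fees += OVERDRAFT_FEE
    return balance, fees


def refund_owed(txns, orders_days):
    """Difference between the imagined future and what was actually done."""
    actual_balance, _ = replay(txns)
    imagined_balance, _ = replay(txns, frozenset(orders_days))
    return imagined_balance - actual_balance


TXNS = [Txn(1, 5_000), Txn(10, -6_000), Txn(12, -1_000), Txn(30, 20_000)]
```

Here `refund_owed(TXNS, range(5, 20))` reruns history with military orders on record for days 5 through 19 and returns the amount to pay back. Comparing final balances rather than summing fees also picks up knock-on effects, such as a fee that itself pushed a later withdrawal into overdraft.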

Kris Jenkins: (35:38)
Yeah. I mean, it all starts with recording what actually happened in the world and then doing something with it. Yeah.

Bobby Calderwood: (35:43)
Exactly, exactly. We're seeing resistance from a lot of software teams around that because it does take this extra step, right? It does take this extra discipline and a little bit of extra work for the transaction processing teams. But if the transaction processing teams commit to that and do that in the right way, the whole organization benefits a ton. Now, not a lot of those benefits necessarily are accrued by the transaction processing team who's bearing the cost. So we have a little bit of this sort of malincentive problem where the teams that can actually make this change that'll make everything better aren't incentivized to do so necessarily because it's unfamiliar and it's going to cost us. We're going to slow down and then we don't meet our metrics, and so now the boss doesn't get the bonus and all that stuff. So we've got this sort of bad incentive problem because a lot of the benefits that accrue to the organization don't necessarily accrue to the specific team that has to make the change and bear the cost.

Bobby Calderwood: (36:38)
So there's a really interesting sort of organizational incentive problem. That's precisely why we wanted to make oNote was this transition is hard. We know it's hard to make this leap from thinking about current state to thinking about streams of events. Let us give you some tooling that will help you make it easier, maybe even accelerate you a little bit. So that's really where we want to apply oNote and sort of event modeling and event-driven design thinking is apply it at the edge and help these teams reduce that cost, at least. Maybe if we can't help them accrue the benefits, at least reduce the cost of adopting this event-driven orientation, this event-driven mindset.

Kris Jenkins: (37:14)
Yeah. So what's the point in your life where you say, "Okay, this..." I mean, it's a good idea, but there must be a point in your life and you say, "I'm going to jump and create a tool that does this because I think it will help adopt that big idea."

Bobby Calderwood: (37:29)
Yeah. That's a great question. That's sort of the semi-unhinged sort of deranged mentality of the startup founder, right? Where you're willing to do that. You see this opportunity and you're like, instead of being rational, being like, "That's a good idea. I'm going to go back to my job and make money." You're sort of like, "I'm going to dive in after that idea. Maybe my family will eat, maybe they won't," right? So yeah. There's...

Kris Jenkins: (37:52)
But at least they'll be happy.

Bobby Calderwood: (37:53)
Yeah. They will be happy, that's right. Yeah. We got to the point early in Evident Systems' history, we were doing professional services. So we were doing this big project with a big tax preparation service in the US. And they were going to sort of rewrite everything event first and they were just having the hardest time doing it, right? It was unfamiliar. There was resistance from the architects and the developers because they're like, "This is going to be too costly. We just need to get our job done." And we did some event modeling with them. It helped, it sort of unlocked... kind of broke the log jam in certain ways. But then there's still this resistance anyway.

Bobby Calderwood: (38:30)
So we were doing this work and we found that, "Hey, event modeling's awesome. I think event modeling could solve a lot of this problem, but we need a tool. Like we need something that's going to help guide teams and help teams to do this in a sort of self-serve basis so they don't have to hire expensive consultants like us to make this happen, right? They can sort of do this on a self-serve basis." So that was really the point in our world when we went from professional services, like we're going to help teams solve this one team at a time, to let's make some tooling that's going to make this easy for any given team to adopt this. And we had a great partnership with Confluent. We were like, "Hey, we know all these people, we can sort of integrate with their stuff and maybe help some of their customers, right?" So we had sort of a well-trodden path ahead of us. Confluent was leading the way, we're sort of drafting behind and saying like, "Hey, what Confluent's doing for the analytics world in the back office, we can help sort of push out towards those sort of teams at the edge, these software engineering teams."

Kris Jenkins: (39:28)
Right. And help shift mindsets around it.

Bobby Calderwood: (39:31)
Yes, exactly. And then, that makes everything back office better, right? It makes all data that's getting into the data in motion...

Kris Jenkins: (39:40)
Yeah. Because we have all these wonderful tools for storing this stream of facts and processing it, but if you don't get the stream of facts right, if you don't learn to capture that and get them down, then you've got this jet that you're using to drive to the shops and back.

Bobby Calderwood: (39:55)
Yeah. Exactly, exactly. And I think a lot of Kafka users feel that way. They're kind of like, "This is great. Like we've got this tool and we're just not getting the value out of it that we expected, right?" And I think the next like domino to fall needs to be these teams at the edge, these services' app teams.

Kris Jenkins: (40:18)
It's difficult. There's sort of a chicken and egg thing, right? You need the tools to be able to think that way and you need to think that way to be able to use the tools.

Bobby Calderwood: (40:24)
Yeah. Precisely, precisely.

Kris Jenkins: (40:26)
And we just sort of jiggle the feedback loop until things change in our world.

Bobby Calderwood: (40:30)
That's right, that's right. So yeah, it's an exciting time. I mean, we see this transition happening across the industry again with great companies like Confluent kind of leading the way and changing minds, we can at least have the conversation now. 5, 10 years ago, you try to have this conversation with the CTO and they're like, "You're crazy. Like we're not even going to consider that." Now, they're like, "Oh yeah, I get it. Like event-driven and streaming data. Like that makes sense to me." There are other great tools in the ecosystem, like EventStoreDB, which is a fantastic like app and service database at the edge, that's designed to store events and then you can schlep those events back on Kafka to everyone behind-the-scenes kind of thing. So there's great tools, there's great development methodologies. It's just getting these tools and methodologies kind of into the mainstream, getting them in common practical use among dev teams.

Kris Jenkins: (41:17)
Getting them under fingers in the keyboard and...

Bobby Calderwood: (41:19)
That's right, that's right. So it's hard work, yeoman's work that we have to do, but that's the mission. We think that this'll really change software engineering for the better and free up developers from having to do a lot of the un-mucking and un-complexifying their systems, the stuff that we spend 80% of our time doing that's just not very fulfilling. Paying that cost to add the hundredth feature. It's not very fun, right? If we can free up these dev teams to really be focused on solving customer problems and really pleasing and delighting their product stakeholders, like everyone wins, everyone's happier. And that has this sort of benign effect on everyone else downstream that we've talked about. All these crosscutting concerns also get better data. So it really makes a lot of things better for everyone if we could figure out a way to incentivize these teams to pay that cost to re-envision how they design these systems and so forth. So...

Kris Jenkins: (42:12)
Yeah, yeah. And we get this thing where like when you are adding that hundredth feature and it's a misery, that's super painful. But on the other side, on the flip side of the coin, when you're adding the hundredth feature and it just goes in easily because your system was well designed from the start, you just feel so smart.

Bobby Calderwood: (42:30)
Yeah, you do.

Kris Jenkins: (42:30)
Really satisfying.

Bobby Calderwood: (42:32)
It's super satisfying, but I wonder if absent the pain, or the pain being forgotten, if you notice how great that is, right? Unless you've been adding the hundredth feature and really feeling that in your skin, how uncomfortable that is, I wonder if you don't realize how great it is to add the hundredth feature at a linear cost and everything just works. So that's the world that I envision in the future where developers don't even notice the cost of that hundredth feature because we've fully made this transition. And it'll only be us old people who are like, "Oh, back in my day, we had to do... That was horrible." So...

Kris Jenkins: (43:06)
Yeah, yeah. I'm sure there are plenty of artists who think the suffering is a necessary part of the development, but...

Bobby Calderwood: (43:14)
Sometimes it's like that way in startups, too. And one feature that we just announced at Kafka Summit recently that we're really excited about is code generation and that's one of those things that we like hypothesized. We're like, "Yeah, if we have the event model as a data structure, we should just be able to generate code." That was always a hypothesis when we were building the very first version of oNote, but we sort of got to that point in time for Kafka Summit.

Bobby Calderwood: (43:38)
My Kafka Summit talk this past Kafka Summit Americas, we talked about Avro code generation, right? Avro's used and beloved and hated within the Kafka and Confluent communities. And we found a way to integrate the development of both event schemas, the things that you'd actually register with the Confluent Schema Registry, as well as the kind of network web services protocol defined by the commands and read models. We could generate all that stuff in Avro, so the Avro schema files and the Avro protocol that basically represents your system. We could generate those set of Avro artifacts and then from the protocol, we can actually generate runtime code. Right? There's a lot of tools in the Avro community for generating runtime code for Avro IPC or gRPC. That's a target that we hit in Java: we could generate a gRPC app that uses Avro instead of Protobufs as its sort of serialization format. And then the system sort of generates itself, right? You sort of just plug it into your editor and it auto completes the interface code for you. And you're like, "Oh, okay, great. I have to implement this method and this method, and then the system works. That's really cool."
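
As a rough illustration of treating the event model as a data structure and generating Avro schemas from it, here is a small Python sketch. It is not oNote's generator: the `EVENT_MODEL` shape, the `com.example.coffee` namespace, and the function names are assumptions made up for the example. The output follows the standard JSON form of an Avro record schema (`type`, `name`, `namespace`, `fields`).

```python
import json

# Hypothetical, simplified event model: event name -> {field name: Avro primitive type}.
EVENT_MODEL = {
    "OrderPlaced": {"order_id": "long", "drink": "string"},
    "DrinkMade": {"order_id": "long", "barista": "string"},
}


def to_avro_schema(name, fields, namespace="com.example.coffee"):
    """Build an Avro record schema (as a plain dict) for one event type."""
    return {
        "type": "record",
        "name": name,
        "namespace": namespace,
        "fields": [{"name": f, "type": t} for f, t in fields.items()],
    }


def generate_schemas(model):
    """Map every event in the model to its Avro record schema."""
    return {name: to_avro_schema(name, fields) for name, fields in model.items()}


if __name__ == "__main__":
    # Print each schema as it might appear in a .avsc file.
    for name, schema in generate_schemas(EVENT_MODEL).items():
        print(f"--- {name}.avsc")
        print(json.dumps(schema, indent=2))
```

A real generator would also need to handle optional fields, nested records, and the Avro protocol for commands and read models; this sketch covers only flat event records.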

Bobby Calderwood: (44:46)
So we did some demo stuff. We've done some training and some sample apps since then that allow us to sort of bootstrap this Java web service and have a Java client that uses the same protocol to be able to basically speak to the service. So we have a thing written in Clojure that's JavaFX on the front end, and then this web service in the backend that speaks gRPC, and the two can talk all based on the sort of artifact that we generated from oNote. So it's an exciting time. We have plans for a lot of other features like GraphQL and some of the other sort of web service definition languages, REST via OpenAPI Swagger, a couple of different things like that.

Bobby Calderwood: (45:29)
But we think if we can generate those from your design artifact, we're really going to accelerate the development cadence for these transaction processing teams and maybe reduce the cost or the bite of re-envisioning how they design these apps, right? If they're going to spend a little more time in design and thinking about the event model, but as a payoff from that investment in time, they're actually going to accelerate their development a little bit because they can generate appreciable parts of their system. So, that's sort of the trade off that we're dangling in front of these teams to say, "Hey, yeah. It's going to be hard to sort of adopt this mindset, but here's some great tooling and we'll accelerate you, so maybe you'll break even and it'll be fine. You'll still hit your goals, all that kind of stuff." So...

Kris Jenkins: (46:08)
Yeah. I think that's so important with tools like this, where you... I mean, we've all experienced a thing where there's an exciting new documentation tool and you do the documentation. And then you go into coding and it just becomes this dead artifact that never gets updated. There has to be a relationship between your modeling tool and your code or one will die.

Bobby Calderwood: (46:28)
That's right, that's right. [crosstalk 00:46:30] It's usually the documentation. Yeah. That's right. And it's usually the documentation or the design artifact that falls by the wayside because we do spend time in the code. So that's really our goal is to make the payoff for updating this design artifact sufficiently valuable that there's a natural incentive to keep this updated so that it is a blueprint of the running system, not just during design, not just during the implementation, but throughout operations. Like we continue to get value from this artifact, so we continue to update it. That's exactly the sort of incentive structure we want to put forth.

Kris Jenkins: (47:00)
A living relationship between the two.

Bobby Calderwood: (47:01)
That's right. That's exactly right. Yep.

Kris Jenkins: (47:03)
So there was one more question I wanted to ask you about this because I think this is hard and interesting. You're designing oNote, you're designing it for a lot of different kinds of users, right? Developers, managers, product managers, CTOs. What did you do to try and address those different mindsets in an actual web tool?

Bobby Calderwood: (47:24)
Yeah. So, that's a great question and it's one that we're still answering. So from a marketing perspective and product dev perspective, we have our personas and we're thinking about what is this person thinking about when they're encountering our tool and all that kind of normal design thinking and stuff. We use event modeling in oNote in our own process, so we're actually using it.

Kris Jenkins: (47:45)
Good.

Bobby Calderwood: (47:46)
We're dogfooding our tool. We're kind of building event models of the different interactions that we have in our system. But a lot of it is just sort of iterative. We talk to our user community, and we take it and put it in front of product folks and say, "Hey, what about this?" We're actually running pilots right now. Free pilots, so if anyone's interested who's listening to this, we're happy to do a pilot with your company for free. But that's exactly why we're doing the free... What we get out of doing a free pilot with a company is exactly answers to these questions where we can say, "Okay, the product owner on this team ran into this set of issues where they couldn't get past this mental block. The tool has to be able to address that for them. The engineers, on the other hand, had this set of concerns or this set of missing features or whatever." So, it's really just in reality where we're taking it out and putting it in front of users to get their feedback on it. That's where the rubber meets the road, that's where we try to add the value back in. So, yeah. We're in that process now, we're running some pilots with some really great companies that we're proud to be working with. And, yeah, if anyone in hearing of this podcast is interested, I'd love to run a pilot with your company.

Kris Jenkins: (48:53)
Is there a link? Do we put the link in the show notes? Or do we give them your Twitter handle?

Bobby Calderwood: (48:57)
Yeah, we can put a link.

Kris Jenkins: (48:58)
How do people find you for that?

Bobby Calderwood: (48:59)
Yeah. Twitter handle, we'll put a link in the show notes. There's just a little registration form that you fill out and then we get in touch with you. So...

Kris Jenkins: (49:05)
Cool. Well, that seems like a nice point to leave this story and I hope oNote is a great success coming forward.

Bobby Calderwood: (49:13)
Thank you, Kris. Appreciate you having me on the show.

Kris Jenkins: (49:15)
I appreciate you teaching me something about event versus data modeling because I must admit, before we started this, I thought they were basically the same thing. And now, I'm enlightened.

Bobby Calderwood: (49:25)
Awesome. Good stuff. Thanks, Kris.

Kris Jenkins: (49:27)
Thanks very much, Bobby. Pleasure talking to you.

Bobby Calderwood: (49:29)
Cheers.

Kris Jenkins: (49:29)
Cheers, bye. Well, I learned a lot from that and I hope you did, too. Before we go, let me remind you that if you want to learn more about event-driven architecture, we'll teach you everything we know at Confluent Developer. That's developer.confluent.io. If you're a beginner, we have getting started guides. And if you want to learn more, there are blog posts, recipes, in depth courses, and more. If you take one of those courses, you can follow along easily by setting up a Confluent Cloud account. And if you do that, make sure you register with the code PODCAST100 and we'll give you $100 of extra free credit. Well, and if you are already an expert in Kafka, well, get in touch and we'll have you on the show. For that or any other reason, do drop us a line. If you're listening to this, you'll find our contact details in the show notes. And if you are watching, there are links in the description and probably a comment box down below there. So use that. If you liked today's episode, then please give it a like, or a star, or a share, or a rating, or whatever buttons you have on your user interface to let us know you care. And with that, it just remains for me to thank Bobby Calderwood for joining us and you for listening. I've been your host, Kris Jenkins. We'll catch you next time. Thanks.