Gently Down the Stream – A gentle introduction to Apache Kafka (2021) (gentlydownthe.stream)
88 points by warrenm on June 15, 2023 | 14 comments


Discussed at the time:

I wrote a children's book / illustrated guide to Apache Kafka - https://news.ycombinator.com/item?id=27541339 - June 2021 (226 comments)

(Reposts are fine after a year or so and links to past threads are just to satisfy extra-curious readers!)


thanks - I didn't see it when I searched :)


Reposts are fine after a year or so! This is in the FAQ: https://news.ycombinator.com/newsfaq.html.

Links to past discussions are just to satisfy extra-curious readers.

Edit: oh wait I already said that. Never mind - I just wanted to make sure you got that it wasn't a reproach.


I’ve found this HN comment to be the single best explanation of the purpose of Kafka for engineers (like me) who never truly appreciated why you’d want it in the first place: https://news.ycombinator.com/item?id=35160555


Wow, that’s such an amazing comment. It explains very well what Kafka does and why it’s valuable from a technical perspective, but I wanted to add why it’s valuable from a “corporate political” perspective. This is how I usually “sell” Kafka to new recruits.

Imagine a big corporation - lots and lots of various teams all doing their own thing - business processes, auth, customer admin etc.

When all those micro or macro services want to talk to each other, you definitely don’t want them to all dip their fingers into one large shared database - you want some contractual guarantees between them - graphql / rest with openapi, that kind of thing. And it all works up to a certain scale.

But now imagine this intricate web of network requests … and then one service fails… This can cause a cascade that’s really hard to track.

Or … you have one very reliable service being used by various clients, but suddenly there is one more client that sends 10 times more requests than all the other clients combined - happens all the time in large corps.

And sure, there are ways to mitigate both through technical or policy means, but what Kafka offers is a single, ready-made solution for all of it.

If the data flowing through a corp does so through Kafka, you end up with very strong contractual guarantees about the shape of the data, for both consumers and producers. Scaling your services becomes mostly a solved problem. Producers of data don’t care too much who their consumers are or how many of them there are. Handling 10x or 100x scale changes becomes, if not trivial, at least fairly easy.
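To make the producer/consumer decoupling above concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Kafka's client API: `ToyLog`, `produce`, `poll`, and the group names are all hypothetical names chosen for illustration. The idea it shows is that producers only append to a log, while each consumer group tracks its own read position, so a slow reader never affects the producer or the other readers.

```python
from collections import defaultdict


class ToyLog:
    """Toy in-memory stand-in for one topic partition (not Kafka's API)."""

    def __init__(self):
        self._records = []                # append-only list of messages
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def produce(self, message):
        """Append a message and return its offset; the producer knows nothing about readers."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return up to max_records unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = ToyLog()
for i in range(5):
    log.produce({"order_id": i})

billing = log.poll("billing")                     # fast consumer reads everything
analytics = log.poll("analytics", max_records=2)  # slow consumer reads at its own pace
analytics_rest = log.poll("analytics")            # and catches up later
```

Adding a third consumer group here would require no change on the producing side, which is roughly the property the comment is describing.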

What that means in practice is that, instead of every team needing to be experts in scaling and deploying their services, you need one really good team to shepherd your Kafka cluster, and the rest don’t need to deal with it too much.

But hosting kafka at scale is quite tricky - haven’t done it myself, but knew the team that handled it at my previous org and those were top notch guys that still struggled with various things.

Anyway, kafka kinda allows for micro services to scale at big orgs, and that to me is just amazing.


> Imagine a big corporation - lots and lots of various teams all doing their own thing - business processes, auth, customer admin etc.

In the past, I worked at a large retailer that started with the large shared database. It was as messy as, if not messier than, you can imagine. We rolled out Kafka to address exactly these issues, where lots of people were generating messages and lots of people needed to process some of them. It was still significant work to manage that infrastructure in a globally redundant environment, but far far far far more manageable for ops teams and data producers/consumers than the BIG DB ever would be.


Wow, thank you! Happy to hear it was helpful.


It really was, have shared it with many colleagues who also appreciated it!


> Gentle introduction

> Starts talking about scaling via partitions

I think this book could have been shorter, or gone into more detail on message creation and retrieval.


You could counter that the whole point of Kafka is scalability. There are many far simpler solutions for message passing when you don't need scale.


That's a cute site.

The author forgot to say that teenage otters set `acks=0` when producing messages :D.
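For anyone who missed the joke: `acks` is the producer setting that controls how many broker acknowledgments a send waits for. With `acks=0` the producer waits for none, so a lost write goes unnoticed; with `acks=1` it waits for the partition leader, and with `acks=all` for all in-sync replicas. A rough sketch of the difference follows. The `send` helper and its `broker_up` flag are hypothetical, not a Kafka client call, and the model lumps `acks=1` and `acks=all` together.

```python
def send(message, acks, broker_up):
    """Toy model of a producer send (hypothetical helper, not a Kafka client call).

    Returns what the producer observes ("sent" or "error") and whether the
    message was actually stored.
    """
    stored = broker_up
    if acks == 0:
        # Fire and forget: the producer never waits for a response,
        # so it reports success even when the write was lost.
        return "sent", stored
    # acks=1 or acks="all": the producer waits for an acknowledgment,
    # so a failed write surfaces as an error it can retry.
    return ("sent" if stored else "error"), stored


reckless = send("hello", acks=0, broker_up=False)
careful = send("hello", acks="all", broker_up=False)
```

In this model `reckless` reports a successful send for a message that was never stored, while `careful` sees the failure.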


I still have nightmares about Kafka. I pulled out my hair for two weeks. That product was very appropriately named.


Very cute, thanks for sharing!


"Apache Kafka"...



