Why should I use KStream or KTable?

A KTable is an abstraction of a changelog stream, where each data record represents an update. More precisely, the value in a data record is interpreted as an “UPDATE” of the last value for the same record key.

Kafka Streams is a library for building streaming applications, specifically applications that transform input Kafka topics into output Kafka topics.

KStream handles the stream of records. On the other hand, KTable manages the changelog stream with the latest state of a given key. Each data record represents an update in KTable.

A KStream is stateless, whereas a KTable is stateful: it has to remember the latest value for every key.
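To make the difference concrete, here is a minimal sketch in plain Java. It is only a stand-in and does not use the Kafka Streams API; the class name StreamVsTable and the sample records are made up for illustration. The same three records are read once as a stream (every record kept) and once as a table (latest value per key wins).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class: plain collections standing in for KStream/KTable.
class StreamVsTable {

    // Stream view: every record is an independent event, so all of them are kept.
    static List<Map.Entry<String, Integer>> asStream(List<Map.Entry<String, Integer>> records) {
        return new ArrayList<>(records);
    }

    // Table view: each record is an update (upsert) for its key, so only the
    // latest value per key survives.
    static Map<String, Integer> asTable(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> r : records) {
            table.put(r.getKey(), r.getValue());
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = List.of(
                Map.entry("alice", 1),
                Map.entry("bob", 2),
                Map.entry("alice", 3));

        System.out.println(asStream(records).size()); // 3 records
        System.out.println(asTable(records));         // {alice=3, bob=2}
    }
}
```

The table needs memory proportional to the number of distinct keys, which is why it has to be backed by state, while the stream view needs none.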


I read that I can use KTable instead of log compaction

A KTable depends on a compacted topic backing its state store, so the two are not mutually exclusive options: the KTable builds on log compaction rather than replacing it.

Or it has many more features

Well, why would you use a cache or a HashMap? The same answer applies to a KTable. The extra feature is that it can be shared and distributed across multiple instances of your application.


You can do more research on the "Stream-Table Duality".

example of ktable and kstream and what can I do?

A KStream is an audit log of all events in the topic, or of a filtered subset of them. It's hard to quickly pick out any given event.

A KTable holds the most recent event for each key from a stream and allows for fast key lookups.

A counter is the simplest example. You have a stream of events (say, text containing hashtags):

#kafka is great
working with #kafka today
#streaming all the things

So, that's the stream. You then need to consume this stream into an aggregated table, parsing out and counting the hashtags, resulting in key-value pairs:

(kafka, 2)
(streaming, 1)

However, if you were to query the table immediately after the first event (before the table consumed the remaining events) you'd only see (kafka, 1).
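The hashtag counter above can be sketched in plain Java as follows. This is a simulation of the idea, not Kafka Streams code: the class HashtagCount and its methods are hypothetical, and a LinkedHashMap stands in for the table's state store. In the real DSL the equivalent would be built from operations such as flatMapValues, groupBy and count, with the result being a KTable.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class: a map playing the role of the aggregated table.
class HashtagCount {
    private static final Pattern HASHTAG = Pattern.compile("#(\\w+)");

    // The "table": latest count per hashtag.
    private final Map<String, Long> table = new LinkedHashMap<>();

    // Consume one event from the "stream" and update the counts.
    void consume(String event) {
        Matcher m = HASHTAG.matcher(event);
        while (m.find()) {
            table.merge(m.group(1), 1L, Long::sum);
        }
    }

    // Fast key lookup, the thing a raw stream cannot offer. Returns null if unseen.
    Long lookup(String hashtag) {
        return table.get(hashtag);
    }

    // Copy of the current table contents.
    Map<String, Long> snapshot() {
        return new LinkedHashMap<>(table);
    }

    public static void main(String[] args) {
        List<String> events = List.of(
                "#kafka is great",
                "working with #kafka today",
                "#streaming all the things");

        HashtagCount counts = new HashtagCount();

        counts.consume(events.get(0));
        System.out.println(counts.snapshot()); // {kafka=1}, queried too early

        counts.consume(events.get(1));
        counts.consume(events.get(2));
        System.out.println(counts.snapshot()); // {kafka=2, streaming=1}
    }
}
```

The first print shows the point about timing: the table only reflects the events it has consumed so far, so an early query sees (kafka, 1).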