What is the difference between Apache Kafka and ActiveMQ?

Solution 1:

Kafka and ActiveMQ overlap in some areas, but they were originally designed for different purposes, so comparing them is like comparing apples and oranges.

Kafka

Kafka is a distributed streaming platform with very good horizontal scaling. It stores streamed data on disk, so applications can process and re-process it. Because of its high throughput, it is commonly used for real-time data streaming.

ActiveMQ

ActiveMQ is a general-purpose message broker that supports several messaging protocols, such as AMQP, STOMP, and MQTT. It supports more complicated message routing patterns as well as the Enterprise Integration Patterns. It is mainly used for integration between applications and services, especially in a Service Oriented Architecture.

Solution 2:

Kafka's architecture is different from ActiveMQ's.

In Kafka, a producer publishes messages to a topic, which is a stream of messages of a particular type. A consumer subscribes to one or more topics and pulls the data from the brokers.
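As a rough illustration of the publish/pull flow described above, here is a minimal in-memory sketch in Python. It is not the Kafka client API: `Topic`, `PullConsumer`, and `poll` are hypothetical names chosen for this example, and a real broker persists the log to disk and splits it across partitions.

```python
class Topic:
    """Toy stand-in for a topic: an append-only list of messages."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def publish(self, message):
        # The producer appends; the returned index plays the role of an offset.
        self.log.append(message)
        return len(self.log) - 1


class PullConsumer:
    """Toy consumer that tracks its own read position (offset)."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        # The consumer asks for data; the topic never pushes anything to it.
        batch = self.topic.log[self.offset:self.offset + max_messages]
        self.offset += len(batch)
        return batch


orders = Topic("orders")
for i in range(5):
    orders.publish(f"order-{i}")

fast = PullConsumer(orders)
slow = PullConsumer(orders)

fast_batch = fast.poll(max_messages=5)  # reads everything available
slow_batch = slow.poll(max_messages=2)  # reads at its own pace
```

Because each consumer keeps its own offset in this sketch, the slow consumer falling behind has no effect on what the fast one has already read.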

Key differences:

  1. The ActiveMQ broker has to maintain the delivery state of every message, which lowers throughput. Unlike an ActiveMQ producer, a Kafka producer (as originally designed) does not wait for acknowledgements from the broker and sends messages as fast as the broker can handle them. In current Kafka clients this behavior is configurable through the producer's "acks" setting. Overall throughput is high as long as the broker can keep up with the producer.

  2. Kafka has a more efficient storage format. On average, each message had an overhead of 9 bytes in Kafka, versus 144 bytes in ActiveMQ.

  3. ActiveMQ is a push-based messaging system, while Kafka is pull-based. In ActiveMQ, the producer sends a message to the broker and the broker pushes it to the consumers. The producer is responsible for ensuring that the message has been delivered. In Kafka, the consumer pulls messages from the broker at its own pace, and it is the consumer's responsibility to consume the messages it is supposed to consume.

  4. Slow consumers in ActiveMQ can cause problems on non-durable topics: they can force the broker to keep old messages in RAM, and once RAM fills up the broker slows down producers, which in turn slows down the fast consumers. A slow consumer in Kafka does not affect other consumers.

  5. In Kafka, a consumer can rewind to an older offset and re-consume data. This is useful when you fix a bug and want to replay the old messages afterwards.

  6. In ActiveMQ, the performance of queues and topics degrades as more consumers are added. Kafka does not have that disadvantage.

  7. Kafka is highly scalable because topics are split into partitions, and those partitions are replicated across brokers for fault tolerance. It guarantees that messages are delivered in order within a partition.

  8. ActiveMQ is a traditional messaging system, whereas Kafka is meant for distributed processing of huge amounts of data and is effective for stream processing.
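Points 5 and 7 (offset rewind and per-partition ordering) can be sketched with a small in-memory model. This is an illustration only, not how Kafka is implemented: `PartitionedTopic` and its methods are hypothetical names, and CRC32 is used here as a stand-in for the key hashing that a real client performs.

```python
import zlib


class PartitionedTopic:
    """Toy topic made of several independent append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 stands in for the real client's key hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, value):
        # Messages with the same key always land in the same partition.
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_messages=100):
        # Reading never removes data, so any older offset can be re-read.
        return self.partitions[partition][offset:offset + max_messages]


topic = PartitionedTopic(num_partitions=3)
for step in ("created", "paid", "shipped"):
    topic.publish("customer-42", step)
    topic.publish("customer-7", step)

p = topic.partition_for("customer-42")

# First pass: consume the partition from offset 0.
first_pass = [v for k, v in topic.read(p, 0) if k == "customer-42"]

# "Rewind": read again from the older offset; the log is unchanged.
replay = [v for k, v in topic.read(p, 0) if k == "customer-42"]
```

In this model the events for one key come back in the order they were published, and replaying is just a second read from an earlier offset.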

Because of these efficiencies, Kafka's throughput is higher than that of traditional messaging systems such as ActiveMQ and RabbitMQ.

More details can be read at notes.stephenholiday.com

EDIT: For anyone who doubts that an ActiveMQ producer waits for acknowledgement from the broker, the ActiveMQ documentation page says:

The ProducerWindowSize is the maximum number of bytes of data that a producer will transmit to a broker before waiting for acknowledgment messages from the broker that it has accepted the previously sent messages.
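To make the quoted definition concrete, here is a toy model of byte-based producer flow control. It is a sketch of the idea only, not ActiveMQ's implementation: `WindowedProducer`, `try_send`, and `on_ack` are hypothetical names, and a real producer would block rather than return `False`.

```python
class WindowedProducer:
    """Toy model of a producer limited by a window of unacknowledged bytes."""

    def __init__(self, window_size):
        self.window_size = window_size  # max unacknowledged bytes allowed
        self.in_flight = 0              # bytes sent but not yet acknowledged

    def try_send(self, payload):
        size = len(payload)
        if self.in_flight + size > self.window_size:
            return False                # a real producer would wait here
        self.in_flight += size
        return True

    def on_ack(self, size):
        # The broker confirmed `size` bytes; free that much of the window.
        self.in_flight = max(0, self.in_flight - size)


producer = WindowedProducer(window_size=1024)
message = b"x" * 400

# The third send would put 1200 bytes in flight, above the 1024-byte window.
results = [producer.try_send(message) for _ in range(3)]

producer.on_ack(400)                    # broker acknowledges one message
after_ack = producer.try_send(message)  # now there is room again
```

The point of the sketch is that the producer keeps sending without waiting for each acknowledgement, but only until the window of unacknowledged bytes is full.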

Solution 3:

I hear this question every week... While ActiveMQ (like IBM MQ or JMS in general) is used for traditional messaging, Apache Kafka is used as a streaming platform (messaging + distributed storage + processing of data). The two are built for different use cases.

You can use Kafka for "traditional messaging", but you cannot use MQ for Kafka-specific scenarios.

The article “Apache Kafka vs. Enterprise Service Bus (ESB)—Friends, Enemies, or Frenemies? (https://www.confluent.io/blog/apache-kafka-vs-enterprise-service-bus-esb-friends-enemies-or-frenemies/)” discusses why Kafka is not competitive but complementary to integration and messaging solutions (including ActiveMQ) and how to integrate both.