Kafka¶
Kafka is a message broker to which you can produce (i.e. publish) records and from which you can consume them.
In the Punch, it is often used to buffer logs between two punchlines, as in the apache_httpd channel.
The Punch provides some command line tools to interact with your Kafka. In this chapter, we will use those commands.
Practice¶
The standalone Punch already has a Kafka broker running.
We will explore the Kafka topic used by the apache_httpd channel.
Start the apache_httpd channel and inject some logs:
channelctl -t mytenant start --channel apache_httpd
punchplatform-log-injector.sh -c $PUNCHPLATFORM_CONF_DIR/resources/injectors/mytenant/apache_httpd_injector.json
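To check that the channel applications are actually running, you can query their status. This assumes your channelctl provides the usual status subcommand, as standard Punch releases do:
channelctl -t mytenant status --channel apache_httpd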
The Kafka topic mytenant_apache_httpd_archiving is created when channelctl start runs: it is declared as a resource in the channel_structure.yaml of the apache_httpd channel.
Let's use a Punch Kafka command to check this topic:
punchplatform-kafka-topics.sh --describe --topic mytenant_apache_httpd_archiving
This displays information about the topic: partitions, replicas, retention configuration, and so on.
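If you are unsure of the topic name, you can also list every topic on the broker. Assuming the punchplatform-kafka-topics.sh wrapper forwards the standard kafka-topics.sh options, this is simply:
punchplatform-kafka-topics.sh --list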
Let's use another Punch Kafka command to check who consumes this topic:
punchplatform-kafka-consumers.sh --list
There is a group named mytenant.apache_httpd.archiving.kafka_input: it corresponds to the kafka_input nodes of the archiving topology.
Let's check their current offset:
punchplatform-kafka-consumers.sh --describe --group mytenant.apache_httpd.archiving.kafka_input
This displays, for each partition, the current offset of the kafka_input node as well as its lag, i.e. the number of records already published to the topic but not yet consumed.
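As a purely illustrative example of how the lag is computed: if the log end offset of a partition is 1500 and the group has committed offset 1350, the lag for that partition is 1500 - 1350 = 150 records.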
If anything went wrong in the archiving topology, we can replay some data by changing the current offset.
Changing the offset requires stopping the consumer.
Let's stop the archiving topology first:
channelctl -t mytenant stop --application apache_httpd/common/archiving
Now we can shift the offsets to replay the last 100 logs:
punchplatform-kafka-consumers.sh --shift-offsets -100 --group mytenant.apache_httpd.archiving.kafka_input
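The --shift-offsets -100 option moves the committed offset of the group back by 100 records (per partition, following the standard Kafka offset-reset semantics the wrapper presumably relies on). You can describe the group again to verify that the current offsets have indeed moved back:
punchplatform-kafka-consumers.sh --describe --group mytenant.apache_httpd.archiving.kafka_input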
Restart the archiving topology:
channelctl -t mytenant start --application apache_httpd/common/archiving
As a result, the last 100 logs will be archived again.
To go further, all you have to do is create a Kafka topic of your own and try producing and consuming records.
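Here is a minimal sketch of such an exercise. It assumes that the punchplatform-kafka-topics.sh wrapper accepts the standard kafka-topics.sh creation options, that the plain Kafka console scripts shipped with the standalone are on your PATH, and that the broker listens on localhost:9092; the topic name mytenant_playground is made up for the example.
# create a topic with a single partition (hypothetical topic name)
punchplatform-kafka-topics.sh --create --topic mytenant_playground --partitions 1 --replication-factor 1
# produce a few records: type one record per line, then hit Ctrl-C
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic mytenant_playground
# consume the records from the beginning of the topic
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytenant_playground --from-beginning
Then describe your topic and list the consumer groups with the same Punch commands as above.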