

If you are not familiar with Kafka, follow this short tour. Kafka is a message broker to which you can produce (i.e. publish) records and from which you can consume them. The standalone punch already has a Kafka running. All you have to do is create a Kafka topic, then try producing and consuming records. We will do that using the handy standalone tools. First, let us create a topic:

--create --kafkaCluster local --topic test_topic --replication-factor 1 --partitions 1
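To make the partition idea concrete before going further, here is a small Python sketch of how a keyed record is mapped to one of a topic's partitions. This is only an illustration: Kafka's real default partitioner uses a murmur2 hash, not md5, and none of the names below come from the punch.

```python
# Conceptual sketch (not Kafka's actual code): a keyed record is assigned
# to a partition by hashing its key modulo the partition count.
import hashlib

def pick_partition(key: str, num_partitions: int) -> int:
    # Stable hash of the key. Kafka really uses murmur2; md5 is used here
    # purely for illustration.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# With a single partition, as in the command above, every record lands
# in partition 0 whatever its key.
assert pick_partition("host-42", 1) == 0

# With more partitions records spread out, but a given key is sticky:
p = pick_partition("host-42", 4)
assert pick_partition("host-42", 4) == p
```

This is why the partition count matters: it bounds the parallelism of downstream consumers, while keyed records still preserve per-key ordering.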

Each topic is defined with some level of replication and a number of partitions. These are Kafka concepts. Next, let us fill a topic with Apache logs. To do that you must start the apache_httpd channel and then inject some logs:

punchctl start --channel apache_httpd -c $PUNCHPLATFORM_CONF_DIR/resources/injectors/mytenant/apache_httpd_injector.json -n 100

Have a look at the apache_httpd_injector.json file. It is self-explanatory. Remember that the tool is extremely powerful and lets you produce arbitrary data, which you can in turn send to Kafka, Elasticsearch, topologies, etc.
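As an illustration of what such an injector does (this is not the punch injector's code, just a stdlib-only sketch with made-up values), here is how synthetic Apache combined-format access-log lines can be fabricated:

```python
# Sketch of a log injector: generate synthetic Apache "combined" format
# access-log lines. The field layout follows the standard combined log
# format; every value is made up.
import random
from datetime import datetime, timezone

def fake_apache_line(rng: random.Random) -> str:
    ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
    ts = datetime(2024, 1, 1, tzinfo=timezone.utc).strftime("%d/%b/%Y:%H:%M:%S %z")
    path = rng.choice(["/index.html", "/login", "/api/v1/items"])
    status = rng.choice([200, 200, 200, 404, 500])
    size = rng.randint(200, 5000)
    return f'{ip} - - [{ts}] "GET {path} HTTP/1.1" {status} {size} "-" "curl/8.0"'

rng = random.Random(0)
for _ in range(3):
    print(fake_apache_line(rng))
```

The real injector drives the same kind of generation from the JSON file, which also controls the throughput and the destination (Kafka, a TCP socket, and so on).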

Let us now check that our messages are in our topic, as expected. You can again use the punch injector, but this time in consumer mode:

--kafka-consumer -topic mytenant_apache_httpd_archiving -brokers local -earliest

It should show the expected number of records. Try it also with the -v option.


Now that you are familiar with some of the most important concepts used in the PunchPlatform, let us try to create a channel. To create new channels you have two options. First, you can refer to the spouts and bolts documentation and write your own configuration. The second option is to work with templates, which ease the generation of the channel configuration files.

Here is how this second option works. To generate channel configuration files you need:

  1. a channel high-level configuration JSON file: there you define only the most important properties of your channel. Typical examples are the (TCP) listening port, the punchlets and the output Elasticsearch cluster.
  2. template files to generate the detailed configuration files: these are .j2 Jinja2 files, one for each required channel configuration file.
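To make the mechanism concrete, here is a stdlib-only Python sketch of the idea. The real punch templates are Jinja2 files and the field names below are invented for illustration; only the placeholder syntax mimics Jinja2.

```python
# Sketch of the template mechanism: a small high-level channel config
# supplies a few values, and a .j2-style template is expanded into a
# full configuration file. Field names are made up for illustration.
import re

channel_config = {           # stands in for the high-level JSON file
    "tcp_port": 9999,
    "es_cluster": "es_search",
}

template = """
{
  "input": { "listen_port": {{ tcp_port }} },
  "output": { "elasticsearch_cluster": "{{ es_cluster }}" }
}
"""

def render(tpl: str, values: dict) -> str:
    # Replace every {{ name }} placeholder with its value, the way a
    # Jinja2 variable substitution would.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(values[m.group(1)]), tpl)

print(render(template, channel_config))
```

The benefit is that you maintain a handful of high-level values per channel, while the templates own all the repetitive topology boilerplate.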


Have a look at the tenants/mytenant/etc/channel_config folder. There you will find the high-level channel configuration JSON files.

Next, have a look at the tenants/mytenant/etc/templates folder. The first template (single) can be used to generate the example channels you just executed. The second (input_kafka_processing) is a variant that generates channels made of two topologies with a Kafka topic in between.

# make sure you are at the top of your platform configuration folder

# Stop your channels
punchctl -t mytenant stop

# Re-generate your apache channel using the input_kafka_processing template
punchctl configure --profile=input_kafka_processing tenants/mytenant/etc/channel_config/apache_httpd_channel.json --override

# ready to go, restart your channel
punchctl -t mytenant start --channel apache_httpd

# inject some data
-c $PUNCHPLATFORM_CONF_DIR/resources/injectors/mytenant/apache_httpd_injector.json

Go have a quick look at the generated channel files. You should easily find that your channel is now composed of two topologies: the first pushes the data to a Kafka topic, the second consumes that topic to parse the logs and insert them into Elasticsearch. An easy way to visualise this new setup is to visit the Storm UI at http://localhost:8080. You should see your two topologies.
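As a conceptual illustration of this layout (in-memory stand-ins only, not the actual Storm or Kafka code), the two-topology pipeline can be sketched as:

```python
# Conceptual sketch of the generated two-topology layout: stage 1 pushes
# raw lines to a Kafka topic (modelled by a queue), stage 2 consumes the
# topic, parses the records and "indexes" them (modelled by a list).
from queue import Queue

topic: Queue = Queue()   # stands in for the Kafka topic
es_index: list = []      # stands in for the Elasticsearch index

def input_topology(lines):
    for line in lines:
        topic.put(line)              # stage 1: forward raw data to Kafka

def processing_topology():
    while not topic.empty():
        raw = topic.get()
        # stage 2: "parse" the record and index it
        es_index.append({"message": raw, "parsed": True})

input_topology(['127.0.0.1 - - "GET / HTTP/1.1" 200 42'])
processing_topology()
print(len(es_index))  # → 1
```

Decoupling the two stages through a topic is what lets the input keep absorbing traffic even when the parsing/indexing stage is slow or restarting.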


In the tenants/mytenant/channels/apache_httpd folder, have a quick look at the channel_structure.json file. It is the one that defines the overall structure of your channel. Compare it to the original sourcefire channel.

This concludes our 10-minute tour. To come back to the original single-channel layout, simply type:

--stop mytenant
# the -f option is to force the generation without asking you to confirm
-f \
  --configure tenants/mytenant/etc/channel_config/apache_httpd_channel.json \
  --profile single