Kafka

If you are not familiar with Kafka, follow this short tour. Kafka is a message broker to which you can produce (i.e. publish) records and from which you can consume them. The standalone punch already has a Kafka broker running. All you have to do is create a Kafka topic, then try producing and consuming records. We will do that using the handy standalone tools. First, let us create a topic:

punchplatform-kafka-topics.sh --create --kafkaCluster local --topic test_topic --replication-factor 1 --partitions 1
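
To make the `--partitions` option concrete: for keyed records, Kafka's default partitioner hashes the key to pick a partition, so a given key always lands in the same one. The following is only a toy sketch of that idea. `partition_for` is a hypothetical helper, and `cksum` stands in for the hash Kafka really uses, so the partition numbers it prints will not match a real cluster.

```shell
# Toy partitioner: map a record key to one of N partitions.
# Assumption: cksum is a stand-in for the hash used by real Kafka clients.
partition_for() {
  local key=$1
  local partitions=$2
  local hash
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( hash % partitions ))
}

# With a single partition, as for test_topic, every key maps to partition 0.
partition_for "host-a" 1

# With three partitions, keys spread out, but a given key is always stable.
for k in host-a host-b host-c; do
  echo "$k -> partition $(partition_for "$k" 3)"
done
```

The replication factor is independent of this mapping: it only sets how many brokers hold a copy of each partition.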

Each topic can be defined with some level of replication and a number of partitions. These are Kafka concepts. Next, let us fill a topic with Apache logs. To do that you must start the apache_httpd channel and then inject some logs:

punchctl start --channel apache_httpd 
punchplatform-log-injector.sh -c $PUNCHPLATFORM_CONF_DIR/resources/injector/mytenant/apache_httpd_injector.json -n 100

Have a look at the apache_httpd_injector.json file. It is self-explanatory. Remember that the punchplatform-log-injector.sh tool is extremely powerful: it lets you produce arbitrary data that you can in turn send to Kafka, Elasticsearch, topologies, etc.

Let us now check that our messages are in our topic, as expected. You can again use the punch injector, but this time in consumer mode:

punchplatform-log-injector.sh --kafka-consumer -topic mytenant_apache_httpd -brokers local -earliest

It should show the number of records you expect. Also try it with the -v option.

Templates

Now that you are familiar with some of the most important concepts used in the PunchPlatform, let's try to create a channel. To create new channels you have two options. First, you can refer to the spouts and bolts documentation and write your own. The second option is to work with templates, which ease the generation of the channel configuration files.

Here is how this second option works. To generate channel configuration files you need:

  1. a high-level channel configuration JSON file: in it you define only the most important properties of your channel. Typical examples are the listening (TCP) port, the punchlets and the output Elasticsearch cluster.
  2. template files to generate the detailed configuration files: these are .j2 Jinja2 files, one for each required channel configuration file.
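
As a rough illustration of the idea, and not of the PunchPlatform generator itself, the sketch below fills Jinja2-style placeholders with a couple of high-level values. Everything in it is an assumption made for the example: the property names, the values `9901` and `es_search`, the template text, and the use of `sed` as a stand-in for a real Jinja2 renderer.

```shell
# Hypothetical high-level settings; in the real workflow these come from the
# channel JSON file under tenants/mytenant/etc/channel_config.
listen_port=9901
es_cluster=es_search

# Hypothetical template text using Jinja2-style {{ name }} placeholders.
template='{ "listen": { "port": {{ listen_port }} }, "output": { "cluster": "{{ es_cluster }}" } }'

# Stand-in renderer: replace each placeholder with its value using sed.
render() {
  printf '%s\n' "$template" |
    sed -e "s/{{ listen_port }}/$listen_port/g" \
        -e "s/{{ es_cluster }}/$es_cluster/g"
}

render
```

The point of the split is that the few values worth changing live in one short file, while the verbose detailed configuration is derived from them.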

Trying Kafka

Have a look at the tenants/mytenant/etc/channel_config folder. There you will find the high-level channel configuration JSON files.

Next, have a look at the tenants/mytenant/etc/templates folder. It contains two templates. The first (single) can be used to generate the example channels you just executed. The second (input_kafka_processing) is a variant that generates channels made of two topologies with a Kafka topic in between.

# make sure you are at the top of your platform configuration folder
cd $PUNCHPLATFORM_CONF_DIR

# Stop your channels
punchplatform-channel.sh --stop mytenant

# Re-generate your apache channel using the input_kafka_processing template
punchplatform-channel.sh \
  --configure tenants/mytenant/etc/channel_config/apache_httpd_channel.json \
  --profile input_kafka_processing
# answer yes to override your current channel generation

# ready to go, restart your channel
punchplatform-channel.sh --start mytenant/apache_httpd

# inject some data
punchplatform-log-injector.sh -c resources/injector/mytenant/apache_httpd_injector.json

Have a quick look at the generated channel files. You should easily see that your channel is now composed of two topologies: the first pushes the data to a Kafka topic, the second consumes that topic to parse the logs and insert them into Elasticsearch. An easy way to visualise this new setup is to visit the Storm UI at http://localhost:8080. You should see your two topologies.
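
If you prefer the command line to the browser, the Storm UI also serves a REST API. The sketch below assumes the Storm version shipped with the standalone exposes the usual `/api/v1/topology/summary` endpoint on the same port. `topology_names` is a hypothetical helper, and the sample document is made up to show the shape of the response, not real output from your platform.

```shell
# Extract topology names from a Storm UI summary document read on stdin.
# This is a quick grep/sed sketch, not a real JSON parser.
topology_names() {
  grep -o '"name" *: *"[^"]*"' | sed -e 's/^"name" *: *"//' -e 's/"$//'
}

# Hypothetical sample response, trimmed to the fields used here.
sample='{"topologies":[{"id":"t1-1-1","name":"t1","status":"ACTIVE"},{"id":"t2-2-2","name":"t2","status":"ACTIVE"}]}'
printf '%s\n' "$sample" | topology_names

# Against a running standalone (assumption: the endpoint is reachable):
# curl -s http://localhost:8080/api/v1/topology/summary | topology_names
```

With the input_kafka_processing profile you would expect two names in the output, one per topology.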

Note

In the tenants/mytenant/channels/apache_httpd folder, have a quick look at the channel_structure.json file. It is the file that defines the overall structure of your channel. Compare it to the original sourcefire channel.

This concludes our 10-minute tour. To return to the original single channel layout, simply type:

punchplatform-channel.sh --stop mytenant
# the -f option is to force the generation without asking you to confirm
punchplatform-channel.sh -f \
  --configure tenants/mytenant/etc/channel_config/apache_httpd_channel.json \
  --profile single