If you are not familiar with Kafka, follow this short tour.
Kafka is a message broker in which you can produce (i.e. publish) and consume records. The standalone punch
already has a Kafka broker running. All you have to do is create a Kafka
topic, then try producing and consuming records.
We will do that using the handy standalone tools. First let us create a topic:
punchplatform-kafka-topics.sh --create --kafkaCluster local --topic test_topic --replication-factor 1 --partitions 1
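To make the --partitions option more concrete, here is a toy sketch of how a keyed record can be assigned to one of a topic's partitions. It is an illustration only: the partition_for_key helper and the web-01 key are hypothetical, and cksum merely stands in for a hash function (Kafka's default partitioner hashes the key bytes with murmur2).

```shell
# Toy partitioner: maps a record key to a partition index.
# Assumption: cksum is only a stand-in hash; Kafka's default partitioner
# uses a murmur2 hash of the key bytes.
partition_for_key() {
  key="$1"
  partitions="$2"
  hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $(( hash % partitions ))
}

# With a single partition, as in test_topic, every key maps to partition 0.
partition_for_key "web-01" 1

# With more partitions, a given key always maps to the same index.
partition_for_key "web-01" 4
partition_for_key "web-01" 4
```

The point of the sketch is that records sharing a key stay together in one partition, which is what preserves their relative order.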
Each topic is defined with a replication factor and a number of partitions. These are Kafka concepts.
Next, let us fill a topic with Apache logs. To do that you must start the apache_httpd channel and then inject some logs:
punchctl start --channel apache_httpd
punchplatform-log-injector.sh -c $PUNCHPLATFORM_CONF_DIR/resources/injector/mytenant/apache_httpd_injector.json -n 100
Have a look at the apache_httpd_injector.json file. It is self-explanatory. Remember that the punchplatform-log-injector.sh tool is extremely powerful: it lets you produce arbitrary data that you can in turn send to Kafka, Elasticsearch, topologies, etc.
Let us now check that our messages are in our topic, as expected. You can again use the punch injector, but this time in consumer mode:
punchplatform-log-injector.sh --kafka-consumer -topic mytenant_apache_httpd -brokers local -earliest
It should show your expected number of records. Try it also using
Now that you are familiar with some of the most important concepts used in the PunchPlatform, let's try to create a channel. To create new channels you have two options. First, you can refer to the spouts and bolts documentation and write your own. A second option is to work with templates, which ease the generation of the channel configuration files.
Here is how this second option works. To generate channel configuration files you need:
- a high-level channel configuration json file: in it you define only the most important properties of your channel. Typical examples are the listening (tcp) port, the punchlets and the output Elasticsearch cluster.
- template files to generate the detailed configuration files: these are .j2 jinja2 files, one for each required channel configuration file.
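The split between a few high-level properties and a template that expands them can be sketched in plain shell, with a here-document standing in for a jinja2 template. Everything in this sketch is an assumption made for illustration: the TCP_PORT and ES_CLUSTER names, their values, and the generated json layout are hypothetical and do not reflect the actual punchplatform file formats.

```shell
# Hypothetical high-level properties (the real channel_config json files
# define their own keys; these names and values are made up).
TCP_PORT=9901
ES_CLUSTER=es_search

# Stand-in for a .j2 template: expands the properties above into a detailed,
# equally made-up, configuration document printed on stdout.
render_channel_config() {
  cat <<EOF
{
  "input": { "type": "tcp", "port": ${TCP_PORT} },
  "output": { "type": "elasticsearch", "cluster": "${ES_CLUSTER}" }
}
EOF
}

render_channel_config
```

Changing a property and rendering again yields a new detailed file without editing it by hand, which is the benefit the template approach aims for.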
Have a look at the tenants/mytenant/etc/channel_config folder. There you will find the high-level channel configuration json files.
Next, have a look at the tenants/mytenant/etc/templates folders. One (single) can be used to generate the example channels you just executed. The second (input_kafka_processing) is a variant that generates channels made of two topologies with a Kafka topic in between.
# make sure you are at the top of your platform configuration folder
cd $PUNCHPLATFORM_CONF_DIR

# Stop your channels
punchplatform-channel.sh --stop mytenant

# Re-generate your apache channel using the input_kafka_processing template
punchplatform-channel.sh \
    --configure tenants/mytenant/etc/channel_config/apache_httpd_channel.json \
    --profile input_kafka_processing
# answer yes to override your current channel generation

# ready to go, restart your channel
punchplatform-channel.sh --start mytenant/apache_httpd

# inject some data
punchplatform-log-injector.sh -c resources/injector/mytenant/apache_httpd_injector.json
Have a quick look at the generated channel files. You should easily see that your channel is now composed of two topologies: the first pushes the data to a Kafka topic, the second consumes that topic to parse the logs and insert them into Elasticsearch. An easy way to visualise this new setup is to visit the Storm UI at http://localhost:8080. You should see your two topologies.
In the tenants/mytenant/channels/apache_httpd folder, have a quick look at the channel_structure.json file. This is the one that defines the overall structure of your channel. Compare it to the sourcefire original channel.
This concludes our 10-minute tour. To come back to the original single channel layout, simply type:
punchplatform-channel.sh --stop mytenant
# the -f option is to force the generation without asking you to confirm
punchplatform-channel.sh -f \
    --configure tenants/mytenant/etc/channel_config/apache_httpd_channel.json \
    --profile single