Punchlines
The next concept to understand is the Punchline. A Punchline is a data pipeline configured to fetch or receive data, process it, and push it downstream.
Here we focus on stream punchlines. In the next chapter, we will cover batch punchlines running on Spark.
Have a look at the input.yaml file:
```yaml
version: '6.0'
runtime: storm
type: punchline
channel: stormshield
meta:
  vendor: stormshield
dag:
  # Syslog input
  - type: syslog_input
    settings:
      listen:
        proto: tcp
        host: 127.0.0.1
        port: 9903
    publish:
      - stream: logs
        fields:
          - log
          - _ppf_local_host
          - _ppf_local_port
          - _ppf_remote_host
          - _ppf_remote_port
          - _ppf_timestamp
          - _ppf_id
  # Punchlet node
  - type: punchlet_node
    settings:
      punchlet_json_resources: []
      punchlet:
        - punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/input.punch
        - punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/parsing_syslog_header.punch
        - punch-stormshield-parsers-1.0.0/com/thalesgroup/punchplatform/stormshield/network_security/parser_network_security.punch
    subscribe:
      - component: syslog_input
        stream: logs
    publish:
      - stream: logs
        fields:
          - log
          - _ppf_id
      - stream: _ppf_errors
        fields:
          - _ppf_error_message
          - _ppf_error_document
          - _ppf_id
  # ES Output
  - type: elasticsearch_output
    settings:
      per_stream_settings:
        - stream: logs
          index:
            type: daily
            prefix: mytenant-events-
          document_json_field: log
          document_id_field: _ppf_id
          additional_document_value_fields:
            - type: date
              document_field: '@timestamp'
              format: iso
        - stream: _ppf_errors
          document_json_field: _ppf_error_document
          additional_document_value_fields:
            - type: tuple_field
              document_field: ppf_error_message
              tuple_field: _ppf_error_message
            - type: date
              document_field: '@timestamp'
              format: iso
          index:
            type: daily
            prefix: mytenant-events-
          document_id_field: _ppf_id
    subscribe:
      - component: punchlet_node
        stream: logs
      - component: punchlet_node
        stream: _ppf_errors
# Topology metrics
metrics:
  reporters:
    - type: kafka
# Topology settings
settings:
  topology.component.resources.onheap.memory.mb: 56 # 56m * (3 nodes) = 168m
```
It implements a stream pipeline which:
- receives logs on a TCP socket.
- parses and enriches this log with Punchlets.
- indexes transformed logs into Elasticsearch.
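Each node in the dag only forwards the fields its `publish` section declares: the syslog input publishes the raw log plus `_ppf_*` metadata, while the punchlet node republishes only `log` and `_ppf_id` on its `logs` stream. A toy sketch of that field projection (this is an illustration of the wiring, not the real Punch node API):

```python
# Toy model of the publish/subscribe wiring in the dag above.
# Each published stream carries only the fields it declares.

def project(record: dict, fields: list) -> dict:
    """Keep only the fields a stream declares, as in a 'publish' section."""
    return {f: record[f] for f in fields if f in record}

# A record as the syslog input might emit it (sample values, not real logs).
raw = {"log": "<13>fw event", "_ppf_id": "42", "_ppf_remote_host": "10.0.0.1"}

# The punchlet node republishes stream 'logs' with only these two fields.
downstream = project(raw, ["log", "_ppf_id"])
print(downstream)  # {'log': '<13>fw event', '_ppf_id': '42'}
```

Downstream nodes therefore never see fields that were dropped upstream, which is why the Elasticsearch output can rely on `log` and `_ppf_id` being present on the `logs` stream.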
Start this punchline in the foreground:

```sh
punchlinectl --tenant mytenant start --punchline $PUNCHPLATFORM_CONF_DIR/tenants/mytenant/channels/stormshield_networksecurity/input.yaml
```
A stream pipeline is now running and ready to receive logs.
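If you want a quick sanity check before using the injector, you can push a single line to the listener by hand. A minimal sketch, assuming the punchline above is running locally on port 9903 and that the TCP listener treats each newline-terminated line as one log (the usual syslog-over-TCP framing; `send_log` is a hypothetical helper, not a Punch tool):

```python
import socket

def send_log(line: str, host: str = "127.0.0.1", port: int = 9903) -> bytes:
    """Send one newline-terminated log line to a TCP listener.

    Returns the bytes actually written so the framing can be inspected.
    """
    payload = (line.rstrip("\n") + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
    return payload

if __name__ == "__main__":
    # A sample Stormshield-style line; real parsing depends on the punchlets.
    send_log("id=firewall time=\"2024-01-15 10:00:00\" fw=\"fw1\" action=pass")
```

If the punchline is up, the line is accepted and flows through the punchlets; if nothing is listening on 9903, the connection is refused, which tells you the pipeline is not running.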
Now, we will inject some logs using the Punch injector tool. It generates Stormshield logs and sends them to your punchline.
In another terminal:

```sh
punchplatform-log-injector.sh -c $PUNCHPLATFORM_CONF_DIR/resources/injectors/mytenant/stormshield_networksecurity_injector.json
```
Check your Elasticsearch: your logs are indexed in mytenant-events-*.
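The `daily` index type in the output settings resolves the index name from the prefix plus the current date. A small sketch of that naming scheme, assuming the common Elasticsearch `yyyy.MM.dd` daily pattern (the exact date format may differ in your deployment):

```python
from datetime import datetime, timezone

def daily_index(prefix: str, when: datetime) -> str:
    """Resolve a daily index name: <prefix><yyyy.MM.dd> (assumed pattern)."""
    return prefix + when.strftime("%Y.%m.%d")

# With the config above, logs indexed today land in something like:
print(daily_index("mytenant-events-", datetime.now(timezone.utc)))
```

This is why the search pattern mytenant-events-* matches all days at once: each day gets its own concrete index behind the wildcard.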
As you can see, punchlines are quite simple to understand, and you can do all sorts of stream computing with them. Now that you have a good understanding of stream punchlines, let's move on to batch punchlines.