Punch templates
Abstract
To help you build a uniform set of punchlines, the Punch leverages a Jinja2 renderer. It allows you to generate punchlines and other configuration items. The purpose of this feature is to:
- ensure punchline homogeneity
- provide a custom abstraction of the punchline
- ease the migration process by applying changes to templates only
- avoid bad configurations by using a validated and tested template
Punch templating is based on the jinjava implementation of jinja2.
Quick tour
With the following file structure:
.
└── tenants
    └── mytenant
        └── etc
            ├── channel_config
            │   └── apache_httpd.yaml
            └── templates
                └── shiva_single_archiving
                    ├── archiving.yaml.j2
                    ├── channel_structure.yaml.j2
                    └── input.yaml.j2
apply the following command:
channelctl -t mytenant configure tenants/mytenant/etc/channel_config/apache_httpd.yaml
and you will generate the following files:
.
└── tenants
    └── mytenant
        ├── channels
        │   └── apache_httpd
        │       ├── archiving.yaml
        │       ├── channel_structure.yaml
        │       └── input.yaml
        └── etc
            ├── channel_config
            │   └── apache_httpd.yaml
            └── templates
                └── shiva_single_archiving
                    ├── archiving.yaml.j2
                    ├── channel_structure.yaml.j2
                    └── input.yaml.j2
Configuration
Let us illustrate how to make templates for a channel composed of two punchlines, input and archiving. You will need to create the values file (tenants/mytenant/etc/channel_config/apache_httpd.yaml) and the Jinja templates (tenants/mytenant/etc/templates/shiva_single_archiving/*.j2). Do not forget to write documentation for your template (such as a README).
1 Bring your working punchline files
The following punchline and channel_structure files are the result you want to generate.
Channel Structure file
version: '6.0'
start_by_tenant: true
stop_by_tenant: true
resources:
- type: kafka_topic
name: mytenant_apache_httpd_archiving
cluster: common
partitions: 1
replication_factor: 1
applications:
- name: input
runtime: shiva
command: punchlinectl
args:
- start
- --punchline
- input.yaml
shiva_runner_tags:
- common
cluster: common
reload_action: kill_then_start
- name: archiving
runtime: shiva
command: punchlinectl
args:
- start
- --punchline
- archiving.yaml
shiva_runner_tags:
- common
cluster: common
reload_action: kill_then_start
Input punchline
version: '6.0'
runtime: storm
channel: apache_httpd
type: punchline
meta:
vendor: apache
technology: apache_httpd
dag:
# Syslog
- type: syslog_input
settings:
listen:
proto: tcp
host: 0.0.0.0
port: 9901
self_monitoring.activation: true
self_monitoring.period: 10
publish:
- stream: logs
fields:
- log
- _ppf_local_host
- _ppf_local_port
- _ppf_remote_host
- _ppf_remote_port
- _ppf_timestamp
- _ppf_id
- stream: _ppf_metrics
fields:
- _ppf_latency
# Punchlet node
- type: punchlet_node
component: punchlet
settings:
punchlet_json_resources:
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/resources/http_codes.json
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/resources/taxonomy.json
punchlet:
- punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/input.punch
- punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/parsing_syslog_header.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/parser_apache_httpd.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/enrichment.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/normalization.punch
subscribe:
- component: syslog_input
stream: logs
- component: syslog_input
stream: _ppf_metrics
publish:
- stream: logs
fields:
- log
- _ppf_id
- _ppf_timestamp
- stream: _ppf_errors
fields:
- _ppf_error_message
- _ppf_error_document
- _ppf_id
- stream: _ppf_metrics
fields:
- _ppf_latency
# ES Output
- type: elasticsearch_output
settings:
per_stream_settings:
- stream: logs
index:
type: daily
prefix: mytenant-events-
document_json_field: log
document_id_field: _ppf_id
additional_document_value_fields:
- type: date
document_field: '@timestamp'
format: iso
- stream: _ppf_errors
document_json_field: _ppf_error_document
additional_document_value_fields:
- type: tuple_field
document_field: ppf_error_message
tuple_field: _ppf_error_message
- type: date
document_field: '@timestamp'
format: iso
index:
type: daily
prefix: mytenant-events-
subscribe:
- component: punchlet
stream: logs
- component: punchlet
stream: _ppf_errors
- component: punchlet
stream: _ppf_metrics
# Kafka Output
- type: kafka_output
settings:
topic: mytenant_apache_httpd_archiving
encoding: lumberjack
producer.acks: all
producer.batch.size: 16384
producer.linger.ms: 5
subscribe:
- component: punchlet
stream: logs
- component: punchlet
stream: _ppf_metrics
metrics:
reporters:
- type: kafka
settings:
topology.component.resources.onheap.memory.mb: 200 # 200m * (4 nodes + 1 topo) = 1G
Archiving punchline
version: '6.0'
type: punchline
channel: apache_httpd
runtime: storm
dag:
# Kafka Input
- type: kafka_input
settings:
topic: mytenant_apache_httpd_archiving
start_offset_strategy: last_committed
fail_action: exit
publish:
- stream: logs
fields:
- log
- _ppf_id
- _ppf_timestamp
- _ppf_partition_id
- _ppf_partition_offset
- stream: _ppf_metrics
fields:
- _ppf_latency
# File Output
- type: file_output
settings:
create_root: true
destination: file:///tmp/archive-logs/storage # File system
topic: apache_httpd
file_prefix_pattern: '%{topic}/%{date}/puncharchive-%{tags}-%{offset}'
batch_size: 1000
batch_expiration_timeout: 10s
fields:
- _ppf_id
- _ppf_timestamp
- log
encoding: csv
compression_format: gzip
separator: __|__
timestamp_field: _ppf_timestamp
subscribe:
- component: kafka_input
stream: logs
- component: kafka_input
stream: _ppf_metrics
publish:
- stream: metadatas
fields:
- metadata
- stream: _ppf_metrics
fields:
- _ppf_latency
# ES Output
- type: elasticsearch_output
component: metadatas_indexer
settings:
per_stream_settings:
- stream: metadatas
index:
type: daily
prefix: mytenant-archive-
document_json_field: metadata
batch_size: 1
reindex_failed_documents: true
error_index:
type: daily
prefix: mytenant-archive-errors
subscribe:
- component: file_output
stream: metadatas
- component: file_output
stream: _ppf_metrics
# Metrics
metrics:
reporters:
- type: kafka
settings:
topology.component.resources.onheap.memory.mb: 56 # 56m * (3 nodes + 1 topo) = 224m
2 Identify the variable parts
You might want to reuse these punchlines for several tenants and several types of logs. In that case, you need to create templates to easily generate the punchlines. The parts of your punchlines you want to templatize are (see the sketch after this list):
- Elasticsearch index names
- Kafka topic name
- listening input port
- punchlets and json_resources
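Templatizing here simply means replacing a hardcoded value with a Jinja2 expression. Below is a minimal sketch based on the kafka_output settings of the input punchline above:

# hardcoded in the original punchline
topic: mytenant_apache_httpd_archiving
# templatized in the corresponding .j2 file
topic: {{channel.tenant}}_{{channel.name}}_archiving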
3 Create your values file
You can create your values file in the location you want. Let us create it at the following path:
tenants/mytenant/etc/channel_config/apache_httpd.yaml
The following values are required:
- channel (string, mandatory): the name of the channel you want to create.
- tenant (string, mandatory): the name of the tenant.
- channel_structure_profile (string, mandatory): the name of the template directory. This is not a full path but only the name of the directory itself. The configure command will look for a directory in the predefined template directory locations.
You can then set custom values to use in your template.
Here is the values file corresponding to the example above (how these values are referenced from the templates is sketched right after it):
channel_structure_profile: shiva_single_archiving
vendor: apache
tenant: mytenant
channel: apache_httpd
input:
host: 0.0.0.0
port: 9901
json_resources:
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/resources/http_codes.json
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/resources/taxonomy.json
punchlets:
- punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/input.punch
- punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/parsing_syslog_header.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/parser_apache_httpd.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/enrichment.punch
- punch-webserver-parsers-1.0.0/com/thalesgroup/punchplatform/webserver/apache_httpd/normalization.punch
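All these values are exposed to the templates under the channel namespace (see the variables listed in step 4). For instance, the listening host and port defined above are consumed by the input template through the following fragment, taken from the template shown later in this page:

listen:
  proto: tcp
  host: {{channel.input.host}}
  port: {{channel.input.port}}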
4 Create the template
Template directory locations
The configure command looks for a template directory whose name is defined by the channel_structure_profile value.
It looks in the following places, in this order (a quick check example follows the list):
- $PUNCHPLATFORM_CONF_DIR/tenants/mytenant/etc/templates/
- $PUNCHPLATFORM_CONF_DIR/tenants/mytenant/templates/
- $PUNCHPLATFORM_CONF_DIR/templates
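In the quick tour above, the shiva_single_archiving profile is therefore resolved from the first location. As a quick sanity check, assuming PUNCHPLATFORM_CONF_DIR points at your configuration directory:

ls $PUNCHPLATFORM_CONF_DIR/tenants/mytenant/etc/templates/shiva_single_archiving/
# archiving.yaml.j2  channel_structure.yaml.j2  input.yaml.j2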
Generated file controls
Each generated file will end up in the following directory: tenants/<tenant>/channels/<channel.name>/. The channel name is defined by the channel value.
The configure command first looks for a channel_structure file in the template directory and applies templating to it.
Once the channel_structure file is generated, the configure command uses the provided applications list (the applications value in the channel_structure file) to generate the corresponding punchlines.
The following settings of your channel_structure applications let you control how your punchlines are generated (a sketch of an applications entry follows the list):
- applications[].name (string, mandatory): the name of the generated file. The configure command finds out the extension from the template file extension (e.g. name: input with template file input.yaml.j2 generates input.yaml).
- applications[].template (string, optional): the name of the template file to use. If not specified, the command looks for a file matching <application_name>.yaml.j2, <application_name>.yml.j2, <application_name>.json.j2 or <application_name>.hjson.j2.
- applications[].additional_templates (list of dicts, optional): a list of additional templates you want to render for this application. For instance, when configuring a plan you want to generate both the punchline and the plan files.
- applications[].additional_templates[].template (string, optional): the name of the additional template file to use.
- applications[].additional_templates[].output (string, optional): the name of the generated file. It must define the file extension (e.g. input.yaml).
- applications[].* (optional): custom values to use in your template. They are only available at the rendered application level. This is useful if you plan to use the same template to generate multiple punchlines with a slight difference, such as the Kafka partition to consume.
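As an illustration, here is a hedged sketch of an applications entry using these settings. The plan.yaml.j2 additional template, the plan.yaml output name and the partition_id custom value are hypothetical and only show where each setting goes; such a custom value should then be reachable through the topology.* variables described below:

- name: archiving
  runtime: shiva
  command: punchlinectl
  args:
  - start
  - --punchline
  - archiving.yaml
  template: archiving.yaml.j2    # optional: explicit template file
  additional_templates:
  - template: plan.yaml.j2       # hypothetical extra template to render
    output: plan.yaml            # generated file name, extension included
  partition_id: 0                # hypothetical custom value, available only when rendering this application
  shiva_runner_tags:
  - common
  cluster: common
  reload_action: kill_then_start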
Template files
Inside the template you can use Jinja2 (jinjava) syntax.
The following Jinja variables are available for channel_structure and application rendering:
- channel.*: all the values defined inside your values file
- channel.name: the name of the channel (same as channel.channel)
- PUNCHPLATFORM_CONF_DIR: the absolute path to the PUNCHPLATFORM_CONF_DIR
- punchplatform.*: all the settings of your generated punchplatform.properties (you will find this file at $PUNCHPLATFORM_PROPERTIES_FILE)
- tenant: the tenant as declared in the values file (same as channel.tenant)
The following Jinja variables are only available for application rendering (a short usage example follows the list):
- topology.*: all the settings defined in your channel_structure and relative to the application you are rendering (e.g. {{ topology.shiva_runner_tags }} gives you the list of shiva tags defined).
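As a short illustration, here is a sketch of a template fragment using these variables; the left-hand keys (my_channel, my_tags, my_conf_dir) are arbitrary keys chosen for the example:

my_channel: {{ channel.name }}               # channel name from the values file
my_tags: {{ topology.shiva_runner_tags }}    # settings of the application being rendered
my_conf_dir: {{ PUNCHPLATFORM_CONF_DIR }}    # absolute path to the configuration directory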
The following examples show the template files used to generate the channel:
Template of Channel Structure file
version: '6.0'
start_by_tenant: true
stop_by_tenant: true
resources:
- type: kafka_topic
name: {{tenant}}_{{channel.name}}_archiving
cluster: common
partitions: 1
replication_factor: 1
applications:
- name: input
runtime: shiva
command: punchlinectl
args:
- start
- --punchline
- input.yaml
shiva_runner_tags:
- {{ channel.cluster_name|default('common') }}
cluster: {{ channel.cluster_name|default('common') }}
reload_action: kill_then_start
- name: archiving
runtime: shiva
command: punchlinectl
args:
- start
- --punchline
- archiving.yaml
shiva_runner_tags:
- common
cluster: common
reload_action: kill_then_start
Template of Input punchline
version: '6.0'
runtime: storm
channel: {{channel.name}}
type: punchline
meta:
vendor: {{channel.vendor}}
technology: {{channel.name}}
dag:
# Syslog
- type: syslog_input
settings:
listen:
proto: tcp
host: {{channel.input.host}}
port: {{channel.input.port}}
self_monitoring.activation: true
self_monitoring.period: 10
publish:
- stream: logs
fields:
- log
- _ppf_local_host
- _ppf_local_port
- _ppf_remote_host
- _ppf_remote_port
- _ppf_timestamp
- _ppf_id
- stream: _ppf_metrics
fields:
- _ppf_latency
# Punchlet node
- type: punchlet_node
component: punchlet
settings:
{%- if channel.json_resources is defined and channel.json_resources | length > 0 %}
punchlet_json_resources:
{%- for jsonResource in channel.json_resources %}
- {{ jsonResource }}
{%- endfor %}
{%- endif %}
punchlet:
{%- for punchlet in channel.punchlets %}
- {{punchlet}}
{%- endfor %}
subscribe:
- component: syslog_input
stream: logs
- component: syslog_input
stream: _ppf_metrics
publish:
- stream: logs
fields:
- log
- _ppf_id
- _ppf_timestamp
- stream: _ppf_errors
fields:
- _ppf_error_message
- _ppf_error_document
- _ppf_id
- stream: _ppf_metrics
fields:
- _ppf_latency
# ES Output
- type: elasticsearch_output
settings:
per_stream_settings:
- stream: logs
index:
type: daily
prefix: {{channel.tenant}}-events-
document_json_field: log
document_id_field: _ppf_id
additional_document_value_fields:
- type: date
document_field: '@timestamp'
format: iso
- stream: _ppf_errors
document_json_field: _ppf_error_document
additional_document_value_fields:
- type: tuple_field
document_field: ppf_error_message
tuple_field: _ppf_error_message
- type: date
document_field: '@timestamp'
format: iso
index:
type: daily
prefix: {{channel.tenant}}-events-
subscribe:
- component: punchlet
stream: logs
- component: punchlet
stream: _ppf_errors
- component: punchlet
stream: _ppf_metrics
# Kafka Output
- type: kafka_output
settings:
topic: {{channel.tenant}}_{{channel.name}}_archiving
encoding: lumberjack
producer.acks: all
producer.batch.size: 16384
producer.linger.ms: 5
subscribe:
- component: punchlet
stream: logs
- component: punchlet
stream: _ppf_metrics
metrics:
reporters:
- type: kafka
settings:
topology.component.resources.onheap.memory.mb: 200 # 200m * (4 nodes + 1 topo) = 1G
Template of Archiving punchline
version: '6.0'
type: punchline
channel: {{channel.name}}
runtime: storm
dag:
# Kafka Input
- type: kafka_input
settings:
topic: {{tenant}}_{{channel.name}}_archiving
start_offset_strategy: last_committed
fail_action: exit
publish:
- stream: logs
fields:
- log
- _ppf_id
- _ppf_timestamp
- _ppf_partition_id
- _ppf_partition_offset
- stream: _ppf_metrics
fields:
- _ppf_latency
# File Output
- type: file_output
settings:
create_root: true
destination: file:///tmp/archive-logs/storage # File system
topic: {{channel.name}}
file_prefix_pattern: '%{topic}/%{date}/puncharchive-%{tags}-%{offset}'
batch_size: 1000
batch_expiration_timeout: 10s
fields:
- _ppf_id
- _ppf_timestamp
- log
encoding: csv
compression_format: gzip
separator: __|__
timestamp_field: _ppf_timestamp
subscribe:
- component: kafka_input
stream: logs
- component: kafka_input
stream: _ppf_metrics
publish:
- stream: metadatas
fields:
- metadata
- stream: _ppf_metrics
fields:
- _ppf_latency
# ES Output
- type: elasticsearch_output
component: metadatas_indexer
settings:
per_stream_settings:
- stream: metadatas
index:
type: daily
prefix: {{tenant}}-archive-
document_json_field: metadata
batch_size: 1
reindex_failed_documents: true
error_index:
type: daily
prefix: {{tenant}}-archive-errors
subscribe:
- component: file_output
stream: metadatas
- component: file_output
stream: _ppf_metrics
# Metrics
metrics:
reporters:
- type: kafka
settings:
topology.component.resources.onheap.memory.mb: 56 # 56m * (3 nodes + 1 topo) = 224m