You are Impatient!

Tip

If you have 20 minutes, take time to go through the other Getting Started chapters.

If you have only 2 minutes, read this chapter.

Run a Punchline

Push the Elasticsearch templates to configure the indices for the demo:

punchplatform-push-es-templates.sh \
    -d $PUNCHPLATFORM_CONF_DIR/resources/elasticsearch/templates/ \
    -l http://localhost:9200

Import the Kibana resources to get the dashboards and index patterns for the demo:

punchplatform-setup-kibana.sh --import -l http://localhost:5601

Start the channels of the platform tenant with channelctl:

channelctl --tenant platform start

Check the status of your platform tenant channels:

channelctl --tenant platform status

The platform tenant is used for monitoring the Punchplatform.

Now, let's play with a tenant that runs a real data pipeline.

Running the same status command against mytenant lists all the channels installed by default on your Standalone. Each channel is a complete data pipeline.

Let's start a channel that is a typical ELK-like example:

channelctl --tenant mytenant start --channel stormshield_networksecurity

A Punchline is now running and ready to receive logs.

Now, we will inject some logs using the Punch injector tool. It will generate fake Stormshield logs and send them to your Punchline.

punchplatform-log-injector.sh -c $PUNCHPLATFORM_CONF_DIR/resources/injectors/mytenant/stormshield_networksecurity_injector.json

Check your Elasticsearch: your logs are indexed in mytenant-events-*.
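One quick way to run this check from the command line is the Elasticsearch `_count` API. The sketch below assumes Elasticsearch is reachable at http://localhost:9200, the same address used when pushing the templates. The `ES_URL` variable and the two helper functions are illustrative names, not part of the Punch tooling.

```shell
# Assumption: Elasticsearch listens on the address used for the template push.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _count endpoint URL for an index pattern (hypothetical helper).
es_count_url() {
  printf '%s/%s/_count' "$ES_URL" "$1"
}

# Ask Elasticsearch how many documents match the index pattern (hypothetical helper).
es_count() {
  curl -s "$(es_count_url "$1")"
}
```

Calling `es_count 'mytenant-events-*'` should return a small JSON document whose `count` field grows while the injector is running. Keep the pattern quoted so that the shell does not try to expand the `*`.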

Go to Kibana: you can explore your logs with the mytenant-events-* index pattern.

To stop your channel:

channelctl -t mytenant stop --channel stormshield_networksecurity

Congratulations! You have just managed a complete ELK-like production pipeline!

Explanation

The Big Data pipeline you started is described by the input.yaml file:

version: '6.0'
runtime: storm
type: punchline
channel: stormshield
meta:
  vendor: stormshield
dag:

  # Syslog input
  - type: syslog_input
    settings:
      listen:
        proto: tcp
        host: 127.0.0.1
        port: 9903
    publish:
      - stream: logs
        fields:
          - log
          - _ppf_local_host
          - _ppf_local_port
          - _ppf_remote_host
          - _ppf_remote_port
          - _ppf_timestamp
          - _ppf_id

  # Punchlet node
  - type: punchlet_node
    settings:
      punchlet_json_resources: []
      punchlet:
        - punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/input.punch
        - punch-common-punchlets-1.0.0/com/thalesgroup/punchplatform/common/parsing_syslog_header.punch
        - punch-stormshield-parsers-1.0.0/com/thalesgroup/punchplatform/stormshield/network_security/parser_network_security.punch
    subscribe:
      - component: syslog_input
        stream: logs
    publish:
      - stream: logs
        fields:
          - log
          - _ppf_id
      - stream: _ppf_errors
        fields:
          - _ppf_error_message
          - _ppf_error_document
          - _ppf_id

  # ES Output
  - type: elasticsearch_output
    settings:
      per_stream_settings:
        - stream: logs
          index:
            type: daily
            prefix: mytenant-events-
          document_json_field: log
          document_id_field: _ppf_id
          additional_document_value_fields:
            - type: date
              document_field: '@timestamp'
              format: iso
        - stream: _ppf_errors
          document_json_field: _ppf_error_document
          additional_document_value_fields:
            - type: tuple_field
              document_field: ppf_error_message
              tuple_field: _ppf_error_message
            - type: date
              document_field: '@timestamp'
              format: iso
          index:
            type: daily
            prefix: mytenant-events-
          document_id_field: _ppf_id
    subscribe:
      - component: punchlet_node
        stream: logs
      - component: punchlet_node
        stream: _ppf_errors

# Topology metrics
metrics:
  reporters:
    - type: kafka

# Topology settings
settings:
  topology.component.resources.onheap.memory.mb: 56 # 56m * (3 nodes) = 168m

It implements a streaming data pipeline that:

receives logs on a TCP socket.
enriches these logs with Punchlets.
indexes enriched logs into Elasticsearch.

There is a second file that describes how and where to run that application. It is called channel_structure.yaml. Its content is:

version: '6.0'
start_by_tenant: true
stop_by_tenant: true
applications:
- name: input
  runtime: shiva
  command: punchlinectl
  args:
  - start
  - --punchline
  - input.yaml
  - --childopts
  - -Xms256m -Xmx256m # will override topology.component.resources.onheap.memory.mb
  shiva_runner_tags:
  - common
  cluster: common
  reload_action: kill_then_start

This file tells the Punch to start the pipeline in a Shiva cluster using punchlinectl.
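The two memory settings that appear in these files can be compared with a little shell arithmetic. This is only a sketch based on the comments in the files themselves: input.yaml declares 56 MB per node for 3 nodes, and channel_structure.yaml states that the --childopts heap settings override that value. The variable names are illustrative.

```shell
# Values copied from the comments in input.yaml and channel_structure.yaml.
per_node_mb=56      # topology.component.resources.onheap.memory.mb
node_count=3        # syslog_input, punchlet_node, elasticsearch_output
childopts_mb=256    # -Xms256m -Xmx256m

# Total memory declared at topology level.
topology_total_mb=$((per_node_mb * node_count))

echo "declared in input.yaml: ${topology_total_mb}m"
echo "heap set by --childopts: ${childopts_mb}m"
```

According to the comment in channel_structure.yaml, the 256 MB heap is the value that applies when the punchline is started through this channel.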

punchlinectl leverages the Punch runtime engine, which is a simplified version of the Storm engine.

Other runtime engines available on the Punch are the official Storm and Spark.