
Parser Configuration Tree

Abstract

This chapter explains where the parsers and their related files are located.

The punch configuration folder of a platform is located under the $PUNCHPLATFORM_CONF_DIR folder. That folder holds all the configuration files of your platform. We focus here on those files related to log management:

  • parsers : a set of punchlet parsers, some of them generic to all log types, some of them specific to a particular log type.
  • elasticsearch templates : a set of templates that define Elasticsearch mappings. In a nutshell, a mapping is a schema definition applied when indexing a document into a given Elasticsearch index. These mappings define field types such as integer, string, date or IPv4 address.
  • grok patterns : a set of standard grok patterns (compatible with Logstash), made available to the punchlet Grok operator. A grok pattern is a predefined regular expression: it takes a string as input and converts it into a well-organised JSON document.
  • punchlets resources : punchlets can be packaged with their own resource files.
  • kibana dashboards : at the end of their journey, logs are stored into Elasticsearch, and rendered to you through a Kibana dashboard.

All these files must be designed carefully and consistently. Depending on the parsing result, you will probably create dedicated Elasticsearch templates and Kibana dashboards. Changing a parser can make its output impossible to index into Elasticsearch. For that reason, all these files must be created with caution and shipped as one consistent software package.

These files can be defined for all tenants, per tenant, or per channel.

The directory layout used to define resources at these various levels is illustrated next.

# the repository folder contains binary parser, java or python
# packages. It is used if you work with maven based deployments.
├── repository
│       └─── firewall-parsers-1.0.0.zip
│
│ # the resources folder contains platform wide (i.e. for all tenants)
│ # configuration files such as mappings, dashboards etc.
├── resources
│   ├── elasticsearch
│   └── kibana
│      
│ # Directory layout for per tenant/channel configuration files.
│ # These are the ones used when starting/stopping a channel.
│ # Do not update these files by hand, they are generated from
│ # samples and templates.
└── tenants
   └── <your_tenant>
      │ 
      │ # The per tenant resources folder contains the punchlets and resource files
      │ # that you ship as plain files. They can be used only by punchlines from
      │ # this tenant.
      ├── resources
      │     └── punch
      │          └── org
      │             └── thales
      │                 └── punch
      │                       │ # the org/thales/punch path is important.
      │                       │ # it conforms to a groupId (org/thales/punch) identifier
      │                       │ # you typically have defined in a maven repository 
      │                       ├── common
      │                       │   ├── input.punch
      │                       │   └── syslog_header_parser.punch
      │                       └── apache_httpd
      │                           ├── resources
      │                           │   ├── http_codes.json
      │                           │   └── taxonomy.json
      │                           └── parser.punch
      │ 
      │ # Your channels 
      └── channels
           ├── <your_channel>
           │   ├── channel_structure.json
           │   ├── <channel_topology_file(s)>.yml

As you can see you have two different ways to ship punchlets:

  1. using packages like firewall-parsers-1.0.0.zip. These packages are uniquely identified, including by a version number, and can be used from all tenants. Each brings in a complete set of tested punchlets, grok patterns and resource files.
  2. by directly inserting punchlet files into your configuration tree. You can do that only inside a tenant. Sharing punchlet files among several tenants is dangerous and ends up being error prone for your ops.

Important

In this example org/thales/punch is only used as an example. If you want to ship your own parsers or functions, use something like 'com/mycompany'.

Whatever you do, make sure to conform to the standard name scoping rules. Say, for example, you have an input.punch punchlet that is common to all your channels because it only adds a timestamp to your parsed log. A good practice is to put it under a common per tenant subdirectory:

conf/tenants/mytenant/resources/punch/org/thales/punch/common/input.punch

Now let's say you need to change that timestamp format only for a given log channel. Put an updated input.punch in the per channel resources directory:

conf/tenants/mytenant/resources/punch/org/thales/punch/common/<your_log_type>/input.punch
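
The two paths above follow the same pattern: a tenant resources root, then a groupId-like prefix written as nested directories, then the punchlet's relative path. The sketch below shows that construction. The helper name `punchlet_path` and its parameters are hypothetical and not part of the punch tooling; the only assumption is the usual Maven convention that a dotted groupId maps to one directory per segment.

```python
from pathlib import PurePosixPath


def punchlet_path(conf_dir, tenant, group_id, relative):
    """Build the location of a tenant-level punchlet in the configuration tree.

    group_id is given in dotted Maven form (for example "org.thales.punch")
    and is turned into nested directories. This helper is illustrative only.
    """
    group_dir = PurePosixPath(*group_id.split("."))
    tenant_root = PurePosixPath(conf_dir, "tenants", tenant, "resources", "punch")
    return tenant_root / group_dir / relative


if __name__ == "__main__":
    # Common punchlet shared by all channels of the tenant.
    print(punchlet_path("conf", "mytenant", "org.thales.punch", "common/input.punch"))
    # Same helper with your own prefix, as recommended in the note above.
    print(punchlet_path("conf", "mytenant", "com.mycompany", "common/input.punch"))
```

Keeping the prefix in one place like this makes it harder for two punchlets with the same file name to collide.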

To refer to that punchlet in a punchline configuration file, see the Punch Node documentation. Here is an example of a punchlet node that refers to the above example files:

type: punchlet_node
component: my_punch_bolt
settings:
  punchlet_json_resources: 
  - org/thales/punch/apache_httpd/resources/http_codes.json
  - org/thales/punch/apache_httpd/resources/taxonomy.json
  punchlet:
  - org/thales/punch/common/input.punch
  - org/thales/punch/common/syslog_header_parser.punch
  - org/thales/punch/apache_httpd/parser.punch
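
Because a node lists its punchlets and resources by relative path, a quick pre-flight check can catch a misspelled file name before a channel is started. The sketch below is one possible way to do it, not a punch feature. It assumes the references are resolved against the tenant's resources/punch directory, as the layout above suggests, and it does not look inside packages from the repository folder. The function name `missing_resources` is hypothetical, and the node is passed as an already-loaded dictionary.

```python
from pathlib import Path

# The punchlet node from the example above, as a plain dictionary.
EXAMPLE_NODE = {
    "type": "punchlet_node",
    "component": "my_punch_bolt",
    "settings": {
        "punchlet_json_resources": [
            "org/thales/punch/apache_httpd/resources/http_codes.json",
            "org/thales/punch/apache_httpd/resources/taxonomy.json",
        ],
        "punchlet": [
            "org/thales/punch/common/input.punch",
            "org/thales/punch/common/syslog_header_parser.punch",
            "org/thales/punch/apache_httpd/parser.punch",
        ],
    },
}


def missing_resources(tenant_punch_dir, node):
    """Return the referenced paths that are not plain files under tenant_punch_dir.

    Assumption: references are relative to <tenant>/resources/punch.
    """
    settings = node.get("settings", {})
    referenced = list(settings.get("punchlet_json_resources", []))
    referenced += list(settings.get("punchlet", []))
    root = Path(tenant_punch_dir)
    return [ref for ref in referenced if not (root / ref).is_file()]


if __name__ == "__main__":
    for ref in missing_resources("conf/tenants/mytenant/resources/punch", EXAMPLE_NODE):
        print("missing:", ref)
```

An empty result only means the files exist as local files; it says nothing about whether the punchlets themselves are valid.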

Important

If you stick to the correct naming conventions, your punchlines will work seamlessly with either local files or installed packages.