Runtime resolver configuration (resolv.hjson)

Overview

Before deploying your platform, you must provide an additional resolv.hjson file in the $PUNCHPLATFORM_CONF_DIR of your deployment environment. This file reduces the verbosity of channel and punchline configuration files in day-to-day operation, and provides automatic resolution and completion of internal component addresses when an operator submits an application or channel start command.

This file defines the translation between logical settings (e.g. an Elasticsearch cluster logical name) and actual settings (which may depend on where the communication originates, and on which filters or nodes it traverses).

This helps keep configuration files easily portable from one platform to another, with minimal constraints.

At deployment time, a resolv.hjson file must be located in a platforms/<platformName> sub-folder of your deployment configuration directory, where platformName is typically 'production'. A symbolic link named resolv.hjson must be set next to the punchplatform-deployment.settings configuration file in $PUNCHPLATFORM_CONF_DIR.
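For instance, with a 'production' platform name, the resulting layout looks like this (an illustrative sketch; your actual tree may differ):

$PUNCHPLATFORM_CONF_DIR/
├── punchplatform-deployment.settings
├── resolv.hjson -> platforms/production/resolv.hjson
└── platforms/
    └── production/
        └── resolv.hjson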

When using operator tools, a local resolver file is pointed to by the PUNCHPLATFORM_RESOLV_FILE environment variable. By default it points to a deployment-time copy of the resolv.hjson provided during deployment.

Each time this file evolves, it should be redeployed to all applicable nodes by running punchplatform-deployer.sh deploy -t platform_configuration. The new file content will be applied at the next start or restart of any punch application.

The same goes for processing occurring on other nodes (shiva, gateway), although in this case the resolver file is the one set up inside the shiva or gateway setup directory (often the /data/opt/punch-shiva- and /data/opt/punch-gateway- folders, respectively).

An example will best illustrate the role of the resolv.hjson file. Consider the following resolv.hjson:

{ 
   // All ES input/output nodes (Storm nodes)
   elasticsearch_nodes:{ 
      selection:{ 
         tenant:*
         channel:*
         runtime:*
      }
      match:$.dag[?(@.type=='elasticsearch_input' || @.type=='elasticsearch_output')].settings
      additional_values:{ 
         http_hosts:[ 
            { 
               host:node2
               port:9200
            }
         ]
      }
   }
   // All ES spark nodes 
   elastic_nodes:{ 
      selection:{ 
         tenant:*
         channel:*
         runtime:*
      }
      match:$.dag[?(@.type=='elastic_batch_input' || @.type=='elastic_batch_output' || @.type=='elastic_stream_output' || @.type=='elastic_input' || @.type=='elastic_query_stats' || @.type=='python_elastic_input' || @.type=='python_elastic_output')].settings
      additional_values:{ 
         nodes:[ 
            node3
         ]
      }
   }
}

What you express here is that all elasticsearch_input or elasticsearch_output nodes, from all tenants and channels, and whatever the runtime of your punchlines (spark, pyspark or storm), will target the Elasticsearch node node2. In contrast, all elastic_batch_input, python_elastic_output and the other listed elastic node types will send their data to node3.
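To make this concrete, here is a minimal sketch of what the first rule does to a punchline (the punchline fragment and its index setting are illustrative, not taken from a real configuration). Given this user-provided node:

{
   dag: [
      {
         type: elasticsearch_output
         settings: {
            // user-level settings only: no Elasticsearch address here
            index: mytenant-events
         }
      }
   ]
}

the resolver submits it with the http_hosts setting injected into the matched settings section:

{
   dag: [
      {
         type: elasticsearch_output
         settings: {
            index: mytenant-events
            // injected by the elasticsearch_nodes rule above
            http_hosts: [
               {
                  host: node2
                  port: 9200
               }
            ]
         }
      }
   ]
}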

Tip

This is not only a convenient feature. It is a fundamental one: it ensures end-users do not have to deal with low-level platform configuration issues, and leaves it to the integrator/administrator to define upfront the consistent data routing used by tenant and channel applications. In addition, it allows the administrator to concentrate in one secured file important security settings, such as the secrets location (certificates directory), that are not published in the public user-level configuration.

This file is thus a series of user-defined rules, each with a unique id and composed of three mandatory sections, described hereafter.
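Each rule thus follows this general skeleton (my_rule_id is an arbitrary illustration; choose any unique id):

{
   my_rule_id: {
      selection: {
         tenant: ...
         channel: ...
         runtime: ...
      }
      match: <a JsonPath expression>
      additional_values: {
         // the settings to inject wherever the match applies
      }
   }
}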

Selection

The selection section determines whether the rule is applicable to a file before it is submitted to shiva, storm or spark. The resolver only applies to files inside a channel (except channel_structure.json) with a .json, .hjson or .yaml extension.

You must define three mandatory parameters inside the selection section:

  • tenant
  • channel
  • runtime

This way, you can scope a rule to a specific use case.

For example, if you want to define a rule for all applications in a specific channel:

selection: {
   tenant: *
   channel: apache_httpd
   runtime: *
}

Or for all applications running in shiva:

selection: {
   tenant: *
   channel: *
   runtime: shiva
}

The * is a wildcard: it matches every tenant, channel or runtime.

Match

Once you have defined your rule's scope with the selection settings, you must define which files, and which sections of those files, are concerned by your rule.

Concretely, if your selection section matches all files from a specific channel, and that channel is composed of several applications, you may want to define your Elasticsearch host for only one application.

To do that, we use a JsonPath expression; it acts as a select on the json file.

For example, given this input json file:

{
  "metrics": {
    "reporters": [
      {
        "type": "kafka"
      },
      {
        "type": "elasticsearch"
      }
    ]
  }
}

If you want to select only the elasticsearch reporter section, you have to use this JsonPath expression:

$.metrics.reporters[?(@.type=='elasticsearch')]

It will retrieve:

[
   {
      "type" : "elasticsearch"
   }
]

All files matching both the selection section AND this match section will be enriched; the others will be submitted without modification.

We use the Jayway JsonPath library internally. If you want to test your match section before submitting it, you can use an online JsonPath evaluator.

Additional values

Finally, once your file matches both previous sections, you can define what you want to add to it in the third section:

additional_values: {
   nodes: [
      server2
   ]
}

It will append these values to every section where the JsonPath expression matched.
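For instance, reusing the reporters file from the Match section together with the http_hosts value from the first example of this chapter (an illustrative combination, not a prescribed setting), the enriched file submitted would be:

{
  "metrics": {
    "reporters": [
      {
        "type": "kafka"
      },
      {
        "type": "elasticsearch",
        "http_hosts": [
          {
            "host": "node2",
            "port": 9200
          }
        ]
      }
    ]
  }
}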

To summarise how the resolver works in a few words: the first two sections of each rule select which files, and which sections of these files, you want to enrich. If they match, the additional values are added to the file before submission. Multiple rules can match the same file.

Debugging the resolver

A punch cli is available to debug the resolver behaviour. You can see what resolved file the resolver computes from a given user file:

punchlinectl resolve -p punchline.json 

It simply prints the resolved file to stdout. Use it to check the resolver behaves as expected.