resolv.hjson

Overview

Before deploying your platform, you must provide an additional resolv.hjson file. This file reduces the verbosity of channel and punchline configuration files and provides automatic address resolution and completion.

The resolv.hjson file must be located in a platforms/<platformName> sub-folder of your deployment configuration directory, where platformName is typically 'production'. A symbolic link named punchplatform-deployment.settings must then be created from the configuration root folder. Remember that the PUNCHPLATFORM_CONF_DIR environment variable defines that location.

When using the punch command-line tools, the PunchPlatform configuration root folder must be provided using the PUNCHPLATFORM_CONF_DIR environment variable. That is, it must look like this:

> $PUNCHPLATFORM_CONF_DIR
    ├── punchplatform-deployment.settings -> platform/singlenode/punchplatform-deployment.settings
    ├── resolv.hjson -> platform/singlenode/resolv.hjson
    └── platform
        └── singlenode
            ├── punchplatform-deployment.settings
            └── resolv.hjson

An example will best illustrate the role of the resolv.hjson file. Consider the following resolv.hjson:

{ 
   // All ES input/output nodes (Storm nodes)
   elasticsearch_nodes:{ 
      selection:{ 
         tenant:*
         channel:*
         runtime:*
      }
      match:$.dag[?(@.type=='elasticsearch_input' || @.type=='elasticsearch_output')].settings
      additional_values:{ 
         http_hosts:[ 
            { 
               host:node2
               port:9200
            }
         ]
      }
   }
   // All ES spark nodes 
   elastic_nodes:{ 
      selection:{ 
         tenant:*
         channel:*
         runtime:*
      }
      match:$.dag[?(@.type=='elastic_batch_input' || @.type=='elastic_batch_output' || @.type=='elastic_stream_output' || @.type=='elastic_input' || @.type=='elastic_query_stats' || @.type=='python_elastic_input' || @.type=='python_elastic_output')].settings
      additional_values:{ 
         nodes:[ 
            node3
         ]
      }
   }
}
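
To make the effect concrete, here is a sketch of what the first rule does, using a hypothetical, minimal punchline fragment. A user writes an elasticsearch_output node without any address:

```hjson
{
   type: elasticsearch_output
   settings:{
      // no http_hosts here: the user does not need to know the ES address
   }
}
```

After resolution, the settings section matched by the rule's JsonPath expression carries the additional values:

```hjson
{
   type: elasticsearch_output
   settings:{
      http_hosts:[
         {
            host:node2
            port:9200
         }
      ]
   }
}
```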

What you express here is that all elasticsearch_input or elasticsearch_output nodes, from all tenants and channels and whatever the runtime type of your punchlines (spark, pyspark or storm), should connect to the elasticsearch node node2. In contrast, all elastic_batch_input and python_elastic_output nodes should send their data to node3.

Tip

This is not only a convenient feature. It is a fundamental feature to ensure end-users do not deal with low-level platform configuration issues, and instead leave it to the administrator to define upfront the consistent data routing used by tenant and channel applications. In addition, it allows the administrator to concentrate important security settings, such as secrets and tokens, in one secured file that is not visible from user-level files.

This file is thus a series of user-defined rules, each identified by a unique id and composed of three mandatory sections, described hereafter.

Selection

The selection section allows you to determine whether the rule applies to a file before submitting it to shiva, storm or spark. The resolver only applies to files inside a channel (except channel_structure.json) with a .json, .hjson or .yaml extension.

You have to define three mandatory parameters inside the selection section:

  • tenant
  • channel
  • runtime

This way, you can scope a rule to a specific use case.

For example if you want to define a rule for all applications in a specific channel:

selection:{ 
         tenant: *
         channel: apache_httpd
         runtime: *
}

Or for all applications running in shiva :

selection:{ 
         tenant: *
         channel: *
         runtime: shiva
}

The * is a wildcard: it matches every tenant, channel or runtime.
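
The selection logic can be sketched in a few lines of Python. This is an illustration of the matching behaviour described above, not the actual resolver implementation; the function and field names are taken from the rule format:

```python
from fnmatch import fnmatch

def rule_applies(selection, tenant, channel, runtime):
    # A rule applies when all three selection fields match the
    # context of the submitted file; '*' matches anything.
    return (fnmatch(tenant, selection["tenant"])
            and fnmatch(channel, selection["channel"])
            and fnmatch(runtime, selection["runtime"]))

selection = {"tenant": "*", "channel": "apache_httpd", "runtime": "*"}
print(rule_applies(selection, "mytenant", "apache_httpd", "storm"))  # True
print(rule_applies(selection, "mytenant", "sourcefire", "storm"))    # False
```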

Match

Once you have defined your rule scope with the selection settings, you have to define which files are concerned by your rule.

Concretely, if your selection section matches all files from a specific channel, and that channel is composed of several applications, you may want to define your elasticsearch host for only one application.

To do that, the resolver uses a JsonPath expression, which acts as a select on the JSON file.

For example, given this JSON input file:

{
  "metrics": {
    "reporters": [
      {
        "type": "kafka"
      },
      {
        "type": "elasticsearch"
      }
    ]
  }
}

If you want to select only the elasticsearch reporter section, you have to set this JsonPath expression:

$.metrics.reporters[?(@.type=='elasticsearch')]

It will retrieve :

[
   {
      "type" : "elasticsearch"
   }
]
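
In plain Python, that filter is equivalent to the following list comprehension. This is a sketch to illustrate what the expression selects, not how the resolver is implemented:

```python
punchline = {
    "metrics": {
        "reporters": [
            {"type": "kafka"},
            {"type": "elasticsearch"},
        ]
    }
}

# Equivalent of $.metrics.reporters[?(@.type=='elasticsearch')]:
# keep only the reporter objects whose type is 'elasticsearch'.
matched = [r for r in punchline["metrics"]["reporters"]
           if r.get("type") == "elasticsearch"]
print(matched)  # [{'type': 'elasticsearch'}]
```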

All files matching both the selection section AND this section will be enriched; the others will be submitted without modification.

We use the Jayway JsonPath library internally. If you want to test your match section before submitting it, you can use an online JsonPath evaluator.

Additional values

Finally, once your file matches both previous sections, you can define what you want to add to your file in the third section:

additional_values:{ 
         nodes:[ 
            server2
         ]
      }

These values are appended wherever the JsonPath expression matched.
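
The enrichment step amounts to merging the additional_values object into every settings section matched by the JsonPath expression. A minimal Python sketch, assuming a shallow merge (the real resolver may handle nested structures differently); the example settings key is hypothetical:

```python
def enrich(matched_settings, additional_values):
    # Return a copy of the matched settings section with the rule's
    # additional values merged in (shallow merge, for illustration).
    merged = dict(matched_settings)
    merged.update(additional_values)
    return merged

settings = {"index_prefix": "logs"}  # hypothetical user-written settings
print(enrich(settings, {"nodes": ["server2"]}))
# {'index_prefix': 'logs', 'nodes': ['server2']}
```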

To summarise how the resolver works in a few words: the first two sections of each rule select which file, and which section of that file, you want to enrich. If the parameters match, the additional values are added to the file before submission. Multiple rules can match.

Debugging the resolver

A punch CLI command is available to debug the resolver:

punchlinectl resolve -p punchline.json 

It simply prints the resolved output file to stdout. Use it to check that the resolver behaves as expected.