Runtime resolver configuration (resolv.hjson)¶
Overview¶
Before deploying your platform, you must provide an additional resolv.hjson
file in the $PUNCHPLATFORM_CONF_DIR of your deployment environment.
This file reduces the verbosity of channel and punchline configuration files during day-to-day operation,
and provides automatic resolution and completion of internal component addresses whenever an operator submits an application or channel start command.
This file defines the translation between logical settings (e.g. an Elasticsearch cluster logical name) and actual settings (which may depend on where the communication originates, through which filters, nodes...).
This makes it possible to move configuration files from one platform to another with minimal constraints.
At deployment time, the resolv.hjson file must be located in the platforms/<platformName>
sub-folder of your deployment configuration directory, where platformName is typically 'production'. A symbolic link named
resolv.hjson
must be created next to the punchplatform-deployment.settings file in $PUNCHPLATFORM_CONF_DIR.
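For instance, with a 'production' platform, the expected layout and link could look like this (paths are illustrative, adjust them to your own deployment):
# from within your deployment configuration directory
cd $PUNCHPLATFORM_CONF_DIR
ls platforms/production/resolv.hjson
# create the symbolic link next to punchplatform-deployment.settings
ln -s platforms/production/resolv.hjson resolv.hjson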
When using operator tools, the local resolver file is pointed to by the PUNCHPLATFORM_RESOLV_FILE environment variable. By default, it points to a deployment-time copy of the resolv.hjson provided during deployment.
Each time this file evolves, it should be redeployed to all applicable nodes by running punchplatform-deployer.sh deploy -t platform_configuration. The new file content will be applied at the next start or restart of any punch application.
The same goes for processing occurring on other nodes (shiva, gateway), although in this case the resolver file will be the one set up inside the shiva/gateway setup dir (often the /data/opt/punch-shiva- and /data/opt/punch-gateway- folders, respectively).
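For example, after editing the resolver file:
# redeploy the platform configuration so that all target nodes receive the updated resolv.hjson
punchplatform-deployer.sh deploy -t platform_configuration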
An example will best illustrate the role of the resolv.hjson file. Consider the following resolv.hjson:
{
  // All ES output nodes (Storm & Spark nodes)
  elasticsearch_nodes: {
    selection: {
      tenant: *
      channel: *
      runtime: *
      name: *
    }
    match: $.dag[?(@.type=='elasticsearch_output' || @.type=='elastic_output')].settings
    additional_values: {
      http_hosts: [
        {
          host: node2
          port: 9200
        }
      ]
    }
  }
  extraction_node: {
    selection: {
      tenant: *
      channel: *
      runtime: *
      name: *
    }
    match: $.dag[?(@.type=='extraction_input')].settings
    additional_values: {
      nodes: [
        node2
      ]
    }
  }
  // All ES spark input nodes
  elastic_nodes: {
    selection: {
      tenant: *
      channel: *
      runtime: *
      name: *
    }
    match: $.dag[?(@.type=='elastic_input' || @.type=='elastic_query_stats' || @.type=='python_elastic_input' || @.type=='python_elastic_output')].settings
    additional_values: {
      nodes: [
        node3
      ]
    }
  }
}
What you express here is that all extraction_input or elasticsearch_output nodes, from all tenants and channels, whatever the runtime type (spark, pyspark or storm) and application name of your punchlines, should forward their data to the Elasticsearch node node2. In contrast, all elastic_input and python_elastic_output nodes should send their data to node3.
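To see the effect, consider a minimal (hypothetical) punchline fragment containing an elasticsearch_output node with no address settings:
{
  dag: [
    {
      type: elasticsearch_output
      settings: {
        index: mytenant-events
      }
    }
  ]
}
At resolution time, the elasticsearch_nodes rule above matches its settings section and appends the additional_values, so the submitted punchline contains:
{
  dag: [
    {
      type: elasticsearch_output
      settings: {
        index: mytenant-events
        http_hosts: [
          {
            host: node2
            port: 9200
          }
        ]
      }
    }
  ]
}
The index value is purely illustrative; only the http_hosts part comes from the resolver.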
Tip
This is not only a convenience feature. It is a fundamental feature that keeps end-users away from low-level platform configuration issues, leaving it to the integrator/administrator to define upfront the consistent data routing of tenant and channel applications. It also allows the administrator to concentrate, in one secured file, important security settings such as the secrets location (certificates directory), which are not published in the public user-level configuration.
This file is thus a series of user-defined rules, each rule identified by a unique id and composed of the three mandatory sections described hereafter.
Note
For applications launched within a channel_structure
whose runtime is shiva, it is possible to resolve resource configurations by using the apply_resolver_on
key (a list of strings). Each element of the list must be a file name relative to your channel application directory.
This feature is intended for applications outside the scope of planctl
and punchlinectl
: for instance python applications like elastalert
or elastichousekeeper
, or simply custom applications.
Setting the shiva daemon log level to DEBUG will show more information about the apply_resolver_on
behavior.
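A minimal sketch of such a declaration inside a channel_structure, assuming a hypothetical elastalert application whose configuration file needs resolution (all names and values are illustrative):
{
  applications: [
    {
      // a custom python application run by shiva
      name: elastalert
      runtime: shiva
      command: elastalert.sh
      // resolve this file, relative to the channel application directory
      apply_resolver_on: [
        elastalert_config.yaml
      ]
    }
  ]
}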
Selection¶
The selection section allows you to determine whether the rule is applicable to a file before submitting it to shiva, storm or spark. The resolver is only applicable to files inside a channel (except channel_structure.json) with a .json, .hjson or .yaml extension.
You can define the following selection parameters inside the selection section:
- tenant
- channel
- runtime: runtime of the punchline (spark, storm)
- name: name of the application
- file: name of the file to be resolved
- host: hostname of the server. Does not support local names and IPs like 127.0.0.1 or localhost
This way you can scope each rule to a specific use case.
For example, if you want to define a rule for all applications in a specific channel:
selection: {
  tenant: *
  channel: apache_httpd
  runtime: *
  name: *
  file: *
  host: *
}
Or for all storm-like stream applications:
selection: {
  tenant: *
  channel: *
  runtime: storm
  name: *
  file: *
  host: *
}
The wildcard character ('*') can be used inside a selection filter value to represent any character sequence (e.g. name: ltr_* for applications whose names start with 'ltr_'). A selection parameter that is not present is equivalent to using the '*' value.
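For instance, to target only applications whose name starts with 'ltr_' (prefix chosen for illustration):
selection: {
  tenant: *
  channel: *
  runtime: *
  name: ltr_*
}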
Match¶
Once you have defined your rule scope with the selection settings, you have to define which files, and which sections within them, are concerned by your rule.
Concretely, if your selection section matches all files from a specific channel and your channel is composed of different applications, you may want to define your Elasticsearch host for only one application.
To do that, you use a JsonPath expression, which acts as a 'select' on the json file.
For example, take the following json file as input:
{
  "metrics": {
    "reporters": [
      {
        "type": "kafka"
      },
      {
        "type": "elasticsearch"
      }
    ]
  }
}
If you want to select only the elasticsearch reporter section, you have to set this JsonPath expression:
$.metrics.reporters[?(@.type=='elasticsearch')]
It will retrieve:
[
  {
    "type": "elasticsearch"
  }
]
All files that match both the selection section AND this match section will be enriched; the other ones will be submitted without modification.
We use the Jayway JsonPath library internally; if you want to test your match section before submitting it, you can use an online JsonPath evaluator.
Additional values¶
Finally, once your file matches both previous sections, you define what you want to add to it in the third section:
additional_values: {
  nodes: [
    server2
  ]
}
These values will be appended wherever the JsonPath expression matched.
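Applied to the reporters example above, a rule whose match expression is $.metrics.reporters[?(@.type=='elasticsearch')] and whose additional_values section is the one just shown would produce the following enriched file (a sketch for illustration):
{
  "metrics": {
    "reporters": [
      {
        "type": "kafka"
      },
      {
        "type": "elasticsearch",
        "nodes": [
          "server2"
        ]
      }
    ]
  }
}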
To summarize how the resolver works in a few words: the first two sections of each rule select which file, and which section of this file, you want to enrich. If the parameters match, the additional values are added to the file before submission. Multiple rules can match.
Debugging the resolver¶
A punch cli is provided for debugging the resolver output. The computed output can be displayed with the command below:
punchlinectl -t mytenant resolve -p punchline.json
The resolver prints the resolved punchline on stdout. This command is intended for debugging only.
If you resolve some other configuration file (channel_structure.hjson, channel monitoring configuration...), use '-f' instead of '-p'.
Remember that resolution with this tool takes into account:
- the tenant (-t option, PUNCHPLATFORM_TENANT environment variable, or tenant folder name),
- the channel (-c option, PUNCHPLATFORM_CHANNEL environment variable, or channel folder name),
- the application name (-n option, PUNCHPLATFORM_APP_NAME environment variable, or punchline file basename without extension),
- the runtime (defined inside punchline files).
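For example, to resolve a punchline while forcing the tenant, channel and application name explicitly (values, and the placement of -c and -n mirroring the -t option above, are illustrative):
# override the values otherwise derived from folder and file names
punchlinectl -t mytenant -c apache_httpd -n parser resolve -p punchline.json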