public class PunchNode
extends org.thales.punch.libraries.storm.api.BaseProcessingNode
Let us start with a simple yet complete example:
```json
{
    "type" : "punch_bolt",
    "bolt_settings" : {
        "punchlet" : "standard/Apache_HTTP_Server/apache.punch"
    },
    "storm_settings" : {
        "component" : "punch_bolt",
        "publish" : [ { "stream" : "logs", "fields" : ["log", "_ppf_timestamp", "_ppf_id"] } ],
        "subscribe" : [ { "component" : "kafka_spout", "stream" : "logs", "grouping" : "localOrShuffle" } ]
    }
}
```
The punchlet property refers to the punchlet you want to execute in the bolt. You can declare one or several; they will then be executed in sequence.
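For instance, assuming the punchlet property also accepts a JSON array (check the reference documentation of your punch release), two punchlets chained in sequence could be declared as:

```json
"bolt_settings" : {
    "punchlet" : [
        "standard/Apache_HTTP_Server/apache.punch",
        "standard/Apache_HTTP_Server/enrichment.punch"
    ]
}
```

The punchlets run in declaration order, each one receiving the document produced by the previous one.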
Each punchlet receives the subscribed storm tuple as a JSON document. In our example it has the form:

```json
{ "logs" : { "field1" : ..., "field2" : ..., "fieldn" : ... } }
```

where field1..fieldn represent the tuple fields emitted by the Kafka spout on the "logs" stream. The punchlet can transform that JSON document in many ways: it typically adds or removes fields, with the intent of forwarding the new fields downstream. To do that, the punch bolt must declare the list of published fields. In our example, the emitted document has the form:
```json
{ "logs" : { "log" : ..., "_ppf_timestamp" : ..., "_ppf_id" : ... } }
```
Should the punchlet generate a document with more fields, the extra fields will be ignored and not emitted. Conversely, should it generate a document missing some of the declared published fields, an empty value will be emitted for each missing field.
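For illustration (the field values here are hypothetical), suppose the punchlet produces:

```json
{ "logs" : { "log" : "GET /index.html HTTP/1.1", "extra" : "some debug data" } }
```

With the published fields declared as ["log", "_ppf_timestamp", "_ppf_id"], the "extra" field is dropped, while "_ppf_timestamp" and "_ppf_id" are emitted with empty values because the punchlet did not set them.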
Because a punchlet can act on both the stream part and the fields part of the document, it can generate new streams, discard a tuple, or generate several tuples from a single input tuple. All in all, a punchlet can do almost anything you can think of, from simple stateless forwarding to new event generation (for example alerting).
Punchlets are referenced by a path relative to the $PUNCHPLATFORM_CONF_DIR/resources/punch/punchlet folder. Some punchlets require additional resource files, typically when they use the findByKey or findByInterval punch operators. Others use Siddhi rules that must equivalently be loaded. To add resource files to your punchlet, proceed as follows:
```json
{
    "type" : "punch_bolt",
    "bolt_settings" : {
        "punchlet_json_resources" : [
            "standard/Apache_HTTP_Server/enrichment.json"
        ],
        "punchlet_rule_resources" : [
            "standard/common/detection.rule"
        ],
        "punchlet" : "standard/Apache_HTTP_Server/enrichment.punch"
    },
    "storm_settings" : { ... }
}
```
Whenever a punchlet fails (for example by throwing an exception), the punch bolt emits an error document on a dedicated error stream. To make configurations easier to write, punch bolts may omit the declaration of that error stream: it will be implicitly added at topology build time. You can however declare the error stream explicitly, should you prefer a completely explicit configuration file. Here is the same example as above with the explicit error declaration:
```json
{
    "type" : "punch_bolt",
    "bolt_settings" : {
        "punchlet" : "standard/Apache_HTTP_Server/apache.punch"
    },
    "storm_settings" : {
        "component" : "punch_bolt",
        "publish" : [
            { "stream" : "logs", "fields" : ["log", "_ppf_timestamp", "_ppf_id"] },
            { "stream" : "_ppf_errors", "fields" : ["_ppf_error", "_ppf_error_message", "_ppf_timestamp", "_ppf_id"] }
        ],
        "subscribe" : [ { "component" : "kafka_spout", "stream" : "logs", "grouping" : "localOrShuffle" } ]
    }
}
```
The error stream must be named "_ppf_errors".
Additional fields can be published on the error stream. These can either be copied from the input stream (any field name is supported, as long as it is present in the subscribed stream) or generated by the punch bolt.
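For instance, to also copy the original "log" field into the error documents (assuming "log" is present on the subscribed input stream), the error publish entry could become:

```json
{ "stream" : "_ppf_errors", "fields" : ["_ppf_error", "_ppf_error_message", "_ppf_timestamp", "_ppf_id", "log"] }
```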
| Constructor and Description |
|---|
| `PunchNode(org.thales.punch.libraries.storm.api.NodeSettings config, PunchletConfig punchletConfig)` Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| `void` | `cleanup()` |
| `Map<String,Object>` | `getComponentConfiguration()` We request from Storm a regular callback. |
| `void` | `prepare(Map stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)` |
| `void` | `process(org.apache.storm.tuple.Tuple tuple)` |
public PunchNode(org.thales.punch.libraries.storm.api.NodeSettings config, PunchletConfig punchletConfig)

Parameters:
- config - the punch bolt settings
- punchletConfig - the punchlet config

public void prepare(Map stormConf, org.apache.storm.task.TopologyContext context, org.apache.storm.task.OutputCollector collector)

Specified by: prepare in interface org.apache.storm.task.IBolt
Overrides: prepare in class org.thales.punch.libraries.storm.api.BaseProcessingNode

public void process(org.apache.storm.tuple.Tuple tuple)

Specified by: process in class org.thales.punch.libraries.storm.api.BaseProcessingNode
public Map<String,Object> getComponentConfiguration()

We request from Storm a regular callback.

Specified by: getComponentConfiguration in interface org.apache.storm.topology.IComponent
Overrides: getComponentConfiguration in class org.thales.punch.libraries.storm.api.BaseProcessingNode
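The method summary notes that getComponentConfiguration() requests a regular callback from Storm. In Storm this is conventionally done by setting the tick-tuple frequency in the component configuration; the sketch below shows the shape of such an override (the 10 second period and the standalone class are illustrative assumptions, not PunchNode's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class TickConfigExample {

    // Literal value of org.apache.storm.Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS,
    // inlined here so the sketch compiles without the Storm dependency.
    static final String TICK_FREQ = "topology.tick.tuple.freq.secs";

    // Shape of a getComponentConfiguration() override that asks Storm to
    // deliver a system "tick" tuple to the bolt at a fixed period.
    static Map<String, Object> componentConfiguration() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TICK_FREQ, 10); // illustrative 10 second period
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(componentConfiguration().get(TICK_FREQ));
    }
}
```

Storm then delivers tick tuples to the bolt's process method at that period, which a node can use for periodic housekeeping.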
public void cleanup()

Specified by: cleanup in interface org.apache.storm.task.IBolt
Overrides: cleanup in class org.thales.punch.libraries.storm.api.BaseProcessingNode
Copyright © 2023. All rights reserved.