
HOWTO add your analytics node

Why do that

Develop your own custom PML node and use it in any PML configuration file.

Prerequisites

An installed PunchPlatform standalone with Spark.

What to do

Implement the Node interface

You must implement the Node interface provided by the punchplatform-job library. Install this dependency in your local Maven repository by executing the following command:

$ punchplatform-development.sh

Note

This command exports the platform jars into your local Maven repository so that you can include them as dependencies in your Maven projects. If you are part of the punch community, you will instead work directly with the punch git repositories.
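The dependency block to add to your pom.xml typically looks like the following sketch. The groupId and version shown here are assumptions (the groupId is inferred from the library's package names); always use the exact coordinates printed by the command:

```xml
<!-- Hypothetical coordinates: replace with the values printed
     by punchplatform-development.sh on your platform. -->
<dependency>
    <groupId>org.thales.punch</groupId>
    <artifactId>punchplatform-job</artifactId>
    <version>5.6.0</version>
</dependency>
```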

Add the punchplatform-job dependency printed by this command to your Maven pom.xml file. You can now create a class implementing the interface:

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.type.TypeReference;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import org.thales.punch.ml.configuration.NodeName;
import org.thales.punch.ml.configuration.NodeType;
import org.thales.punch.ml.configuration.NodeType.Type;
import org.thales.punch.ml.job.IDeclarer;
import org.thales.punch.ml.job.Input;
import org.thales.punch.ml.job.Node;
import org.thales.punch.ml.job.Output;

@NodeName("your_node_name")
@NodeType(Type.OUTPUT_NODE)
public class YourNode implements Node {

  private static final long serialVersionUID = 1L;

  @JsonProperty(value = "param_1")
  public String param_1 = "default_param";

  @JsonCreator
  public YourNode() {
    super();
  }

  @Override
  public void execute(Input input, Output output) throws Exception {
    // Print the received dataset; replace this with your own processing logic.
    System.out.println(input.getSingleton().get());
  }

  @Override
  public void declare(IDeclarer declarer) throws Exception {
    // Declare that this node subscribes to a single Dataset<Row> input.
    declarer.subscribeSingleton(new TypeReference<Dataset<Row>>() {});
  }
}

Deploy your Jar

Compile your algorithm into a jar (with Maven, sbt, ...). To keep the jar lightweight, mark the Spark and PunchPlatform libraries as provided dependencies, since the platform supplies them at runtime.
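With Maven, the provided scope looks like the following sketch. The artifact name and version are assumptions matching the Spark 2.4.3 distribution mentioned below (Scala 2.11 build); adapt them to your platform:

```xml
<!-- Hypothetical example: a Spark dependency marked as provided,
     so it is excluded from the packaged jar. -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.4.3</version>
    <scope>provided</scope>
</dependency>
```

Do the same for the punchplatform-job dependency, then copy the resulting jar into the standalone directories: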

### Copying your jar

## Setup
# let us suppose that we want to patch v5.6.0 of our standalone version
export YOUR_STANDALONE_VERSION=5.6.0
# let us suppose that this standalone comes with spark 2.4.3
export SPARK_VERSION=2.4.3
# let us suppose that your jar name is as below
export YOUR_JAR=myjar.jar

# Create directory if missing
# Directory one
mkdir -p $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-operator-environment-${YOUR_STANDALONE_VERSION}/lib/custom_jars
# Directory two
mkdir -p $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-shiva-${YOUR_STANDALONE_VERSION}/features/plugins/spark-${SPARK_VERSION}-bin-hadoop2.7/punchplatform/analytics/job/custom_jars

# copying the jars
cp ${YOUR_JAR} $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-operator-environment-${YOUR_STANDALONE_VERSION}/lib/custom_jars/

cp ${YOUR_JAR} $PUNCHPLATFORM_CONF_DIR/../external/punchplatform-shiva-${YOUR_STANDALONE_VERSION}/features/plugins/spark-${SPARK_VERSION}-bin-hadoop2.7/punchplatform/analytics/job/custom_jars/

Use Your Node in a PML configuration

You can now refer to your node in a pipeline_stage:

{
    type: your_node_name
    component: your_component
    settings: {
        param_1: hello
    }
    subscribe: [
      {
          stream: input_stream
          component: input_component
      }
    ]
}