HOWTO add your analytics node

Why do that

Follow this guide if you want to add your own (Java or Scala) node to a PunchPlatform Machine-Learning (PML) JSON configuration.

Prerequisites

You need a punchplatform-standalone installed with Spark.

What to do

Implement the Node interface

You must implement the Node interface provided by the punchplatform-job library. You can install this dependency in your local Maven repository with the command:

$ punchplatform-development.sh

Note

This command exports the various platform jars in your local maven repository so that you can include dependencies in your maven projects. Of course if you are part of the punch community, you will directly work with the punch git repositories.
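For illustration, the resulting pom entry typically looks like the following sketch; the exact groupId, artifactId, and version are printed by the command above, and the coordinates shown here are placeholders:

```xml
<dependency>
  <!-- placeholder coordinates: use the ones printed by punchplatform-development.sh -->
  <groupId>org.thales.punch</groupId>
  <artifactId>punchplatform-job</artifactId>
  <version>X.Y.Z</version>
</dependency>
```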

Add the punchplatform-job dependency printed by this command to your Maven module's pom. You can now create a class implementing the interface:

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.type.TypeReference;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import org.thales.punch.ml.configuration.NodeName;
import org.thales.punch.ml.configuration.NodeType;
import org.thales.punch.ml.configuration.NodeType.Type;
import org.thales.punch.ml.job.IDeclarer; // assumed to live alongside Input/Output/Node
import org.thales.punch.ml.job.Input;
import org.thales.punch.ml.job.Node;
import org.thales.punch.ml.job.Output;

@NodeName("your_node_name")
@NodeType(Type.OUTPUT_NODE)
public class YourNode implements Node {

  private static final long serialVersionUID = 1L;

  @JsonProperty(value = "param_1")
  public String param_1 = "default_param";

  @JsonCreator
  public YourNode() {
    super();
  }

  @Override
  public void execute(Input input, Output output) throws Exception {
    // Print the subscribed dataset; replace this with your processing logic.
    System.out.println(input.getSingleton().get());
  }

  @Override
  public void declare(IDeclarer declarer) throws Exception {
    // Declare that this node subscribes to a single Dataset<Row> input.
    declarer.subscribeSingleton(new TypeReference<Dataset<Row>>() {});
  }
}
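To see how the `settings` block of the PML configuration ends up in the annotated fields, here is a stdlib-only sketch that mimics the binding with reflection. The real engine uses Jackson and the @JsonProperty annotations; the class and method names below are illustrative, not part of the punchplatform-job API:

```java
import java.lang.reflect.Field;
import java.util.Map;

// Mimics how the "settings" block of a PML configuration is mapped
// onto a node's public fields (the real engine does this with Jackson).
public class SettingsBindingDemo {

    // Mirrors the node's field: public, with a default value used when
    // the setting is absent from the configuration.
    public String param_1 = "default_param";

    // Apply each settings entry to the matching public field.
    public static void bind(Object node, Map<String, Object> settings) {
        settings.forEach((key, value) -> {
            try {
                Field field = node.getClass().getField(key);
                field.set(node, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("unknown setting: " + key, e);
            }
        });
    }

    public static void main(String[] args) {
        SettingsBindingDemo node = new SettingsBindingDemo();
        System.out.println("before: " + node.param_1); // default value
        bind(node, Map.of("param_1", "hello"));        // the PML settings block
        System.out.println("after: " + node.param_1);  // bound value
    }
}
```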

Deploy your Jar

Compile your algorithm into a jar (with Maven, sbt, ...). To keep the jar lightweight, declare the Spark and PunchPlatform libraries as provided dependencies; at runtime they are already available under:

punchplatform-standalone-*/external/spark-2.2.1-bin-hadoop2.7/jars
punchplatform-standalone-*/external/spark-2.2.1-bin-hadoop2.7/punchplatform/analytics/job/additional_jars
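In Maven, this is done with the provided scope. As a sketch, for the bundled Spark 2.2.1 distribution (treat the coordinates as an example):

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.2.1</version>
  <!-- available at runtime from the jars directories above, so not packaged -->
  <scope>provided</scope>
</dependency>
```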

Use Your Node in a PML configuration

You can now refer to your algorithm in a pipeline_stage:

{
    type: your_node_name
    component: your_component
    settings: {
        param_1: hello
    }
    subscribe: [
      {
          stream: input_stream
          component: input_component
      }
    ]
}