HOWTO add your analytics algorithm

Why do that

You may want to make your own algorithm available as part of the Punchline SDK. Doing so lets users add a node or stage to their Punchline configuration file that leverages your algorithm.

Prerequisites

You need a punch-standalone installation with Spark.

What to do

Implement the spark ML interface

First, implement the Spark machine-learning public interfaces.

There is no official Spark documentation for this, but you can look at this O'Reilly page and/or at the source code of the ML algorithms that are already implemented.
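As an illustration of what implementing these interfaces can look like, here is a minimal sketch of a custom `Transformer`, assuming the Spark 2.x or later `org.apache.spark.ml` API. The class name `UpperCaseTransformer` and the `inputCol` and `outputCol` parameters are hypothetical and chosen only for this example. How Punchline maps node settings onto Spark ML params is not covered here.

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.{Param, ParamMap}
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.{col, upper}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical example: copies a string column into a new upper-cased column.
class UpperCaseTransformer(override val uid: String) extends Transformer {

  // No-argument constructor that generates a unique identifier for this stage.
  def this() = this(Identifiable.randomUID("upperCase"))

  // Parameters exposed by this stage (names are assumptions for this sketch).
  final val inputCol: Param[String] =
    new Param[String](this, "inputCol", "name of the input string column")
  final val outputCol: Param[String] =
    new Param[String](this, "outputCol", "name of the output string column")

  def setInputCol(value: String): this.type = set(inputCol, value)
  def setOutputCol(value: String): this.type = set(outputCol, value)

  // The actual computation: add the output column to the dataset.
  override def transform(dataset: Dataset[_]): DataFrame =
    dataset.withColumn($(outputCol), upper(col($(inputCol))))

  // Describe the output schema without touching any data.
  override def transformSchema(schema: StructType): StructType =
    StructType(schema.fields :+ StructField($(outputCol), StringType, nullable = true))

  // Required by the Params contract; defaultCopy relies on the uid constructor.
  override def copy(extra: ParamMap): UpperCaseTransformer = defaultCopy(extra)
}
```

An algorithm that must be trained first would typically extend `Estimator` instead and return a `Model` from `fit`; the parameter and schema handling follow the same pattern.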

Deploy Your Jar

Compile and package your algorithm using your favorite tool (maven, sbt, ...). Note that your jar must not embed the Spark libraries, which are already shipped with the punchplatform. On a standalone the core spark jars are located under

# Copy built jar to
$PUNCHPLATFORM_INSTALL_DIR/extlib/spark/
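One common way to keep the Spark libraries out of your jar is to declare them with the `provided` scope. The following `build.sbt` is a sketch under stated assumptions: the project name and the Scala and Spark versions are placeholders, and they must be aligned with the versions shipped with your punchplatform installation.

```scala
// build.sbt (hypothetical project; versions below are placeholders)
name := "my_ml"
version := "0.1.0"

// Assumption: use the Scala version your platform's Spark was built with.
scalaVersion := "2.12.18"

// Assumption: use the Spark version shipped with your platform.
val sparkVersion = "3.3.2"

libraryDependencies ++= Seq(
  // "provided" makes Spark available at compile time but leaves it out
  // of the packaged jar, since the platform already ships these libraries.
  "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided"
)
```

With Maven, the equivalent is to set `<scope>provided</scope>` on the Spark dependencies.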

Use the algorithm in a PML configuration

You can now refer to your algorithm in a pipeline_stage:

{
  version: "6.0"
  runtime: spark
  type: punchline
  tenant: mytenant
  dag: [...]
  settings: {
      spark.additional.jars: my_ml.jar
  }
}

and in your MlTransformer node:

{
    type: your.algorithm.Name
    settings: {
        # your ML parameters
    }
}