punchplatform-analytics.sh

commands to launch PML analytic plans and jobs

Synopsis

  • punchplatform-analytics.sh --start <tenant>/<plan> [arguments]
  • punchplatform-analytics.sh --job <job_path> [arguments]

Description

punchplatform-analytics.sh provides the high-level command to launch plans and jobs.

Command

  • --start <tenant>/<plan> [options]

    • Tenant-friendly plan launcher. It starts a plan using the "job.template" and "plan.json" files located in the folder $PUNCHPLATFORM_CONF_DIR/tenants/<tenant>/plans/<plan>, used respectively as template and plan. The Spark cluster is selected from the "analytics_spark_cluster" field of the tenant configuration JSON file "$PUNCHPLATFORM_CONF_DIR/tenants/etc/conf.json".
  • --job <job_path> (--spark-cluster <cluster_name> | --spark-master <master_url>) [options]

    • Starts a job from the <job_path> configuration file.
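As a rough illustration of the lookup described for --start, the following sketch prints the two files the launcher is documented to read. The function name plan_files is hypothetical, the "tenant/plan" argument form is taken from the examples on this page, and the real script may resolve these paths differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of punchplatform-analytics.sh):
# print the template and plan files that "--start tenant/plan"
# is documented to read.
plan_files() {
  tenant=${1%%/*}   # text before the first "/"
  plan=${1#*/}      # text after the first "/"
  dir="$PUNCHPLATFORM_CONF_DIR/tenants/$tenant/plans/$plan"
  printf '%s\n' "$dir/job.template" "$dir/plan.json"
}
```

Calling plan_files analytics/detection_suspect_url with PUNCHPLATFORM_CONF_DIR set prints the job.template path followed by the plan.json path, which can help check that both files exist before launching a plan.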

"--job" and "--plan" commands

One of these options must be set:

  • --spark-cluster <cluster_name>
  • --spark-master <master_url>

Optional options that can be used to override the default behavior:

  • --deploy-mode <client | cluster>

  • --name <name>

    • Set the Spark application name. If a plan is provided, the name is <name>_<date>; otherwise it is <name>. Defaults: with the "--start" command, <tenant>_<plan>; with the "--job" command, <job_path> resolved to an absolute path; with the "--plan" command, <plan_path>_<template_path>, both resolved to absolute paths.
  • --tenant <tenant>

    • Set the PunchPlatform tenant name. It is used to add application metrics into Elasticsearch. If no tenant is set, no metrics are sent to Elasticsearch. The default value is 'default'.
  • --json

    • Set the submission output to JSON only. The returned JSON contains the "runtime_id" job runtime identifier, which can be used to track the status of the corresponding Spark application.
  • --date <date>

    • Set an execution date. In plan mode ("--start" and "--plan" commands), only one job is launched, templated for this date. The format must be ISO 8601 with an offset from UTC ("2011-12-03T10:15:30+01:00").
  • --job <job_path>

    • Set the job configuration file.
  • --json-errors

    • If validating the PML raises an exception, the related errors are reported to stderr as JSON Exception objects, for machine handling.
  • --verbose-data | -vd

    • The input and output data of each node of the PML job are printed to stdout.
  • --verbose | -v

    • Activates all available details (includes -vd): underlying Java command line, full Java stack trace in case of error, and input/output data dump of PML nodes.
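The "runtime_id" field mentioned for --json can be picked out of the submission output with standard tools. The sketch below is only an assumption about how that might look: the helper name runtime_id is hypothetical, the exact shape of the returned JSON is not documented here beyond the presence of that field, and the JSON is assumed to arrive on stdout with the key and its string value on one line.

```shell
#!/bin/sh
# Hypothetical helper: read JSON on stdin and print the value of the
# "runtime_id" field. Assumes the value is a plain JSON string without
# escaped quotes; a real JSON parser is more robust.
runtime_id() {
  sed -n 's/.*"runtime_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Under those assumptions, a submission could be piped through it, for example: punchplatform-analytics.sh --json --start analytics/detection_suspect_url | runtime_id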

Optional options that can be used to override the "punchplatform.properties" configuration:

  • --spark-home <path>

    • Set the Spark installation location.
  • --client-jar <path>

    • Set the client jar location on the client machine.
  • --client-jar-main <main>

    • Set the client jar main.
  • --app-resource-jar <path>

    • Set the app resource jar location on the Spark machine(s).
  • --app-resource-main <main>

    • Set the app resource jar main.
  • --additional-jars

    • Set the locations of additional app resource jars on the Spark machine(s).

Example

To start the plan "detection_suspect_url" of tenant "analytics":

$ punchplatform-analytics.sh --start analytics/detection_suspect_url

To start the plan "detection_suspect_url" of tenant "analytics" without a Spark cluster:

$ punchplatform-analytics.sh --deploy-mode client --spark-master 'local[*]' --start analytics/detection_suspect_url

To start a single job from the plan "detection_suspect_url" of tenant "analytics" for a given execution date:

$ punchplatform-analytics.sh --date 2011-12-03T10:15:30+01:00 --start analytics/detection_suspect_url 
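The --date value in the example above uses an ISO 8601 timestamp with a "+hh:mm" offset. One possible way to produce such a value for the current time is sketched below. Both function names are hypothetical, and the sketch assumes a date command that supports the %z format, as GNU and BSD date do.

```shell
#!/bin/sh
# Hypothetical helper: turn a trailing "+hhmm" offset, as printed by
# "date +%z", into the "+hh:mm" form used in the example above.
with_offset_colon() {
  printf '%s\n' "$1" | sed 's/\([+-][0-9][0-9]\)\([0-9][0-9]\)$/\1:\2/'
}

# Hypothetical helper: current local time in the same form as
# 2011-12-03T10:15:30+01:00.
now_iso8601() {
  with_offset_colon "$(date +%Y-%m-%dT%H:%M:%S%z)"
}
```

The result can then be passed on the command line, for example: punchplatform-analytics.sh --date "$(now_iso8601)" --start analytics/detection_suspect_url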

To start a job from a configuration file:

$ punchplatform-analytics.sh --deploy-mode client --spark-master 'local[*]' --job ./your_job.json