
SPARKCTL

NAME

sparkctl: submit a Spark application either in the foreground or to a Spark cluster

DESCRIPTION

sparkctl is a low-level shell command designed for internal use only. It is subject to change without notice.

With sparkctl, you can submit a Spark application using a Java or Python runtime.

Both batch and streaming Spark applications are supported.

sparkctl --punchline mypunchline.hjson

OPTIONS

  • --punchline

    • path of your configuration file
  • --spark-master

    • Spark master to use: 'local[*]' to run locally, or spark://master:port to submit to a Spark cluster
  • --deploy-mode

    • submission mode: 'client', 'cluster' or 'foreground'
  • --runtime

    • runtime used to execute the punchline: 'spark' or 'pyspark'
  • --spark-work-dir

    • working directory where enriched configuration files are stored and where the Spark driver stores its files
  • --punchplatform-conf-dir

    • root directory of punchplatform.properties
  • --verbose

    • display the executed command and print each punchline node to stdout in dataframe format
  • --no-color

    • disable dataframe coloring if your terminal does not support it
  • --hide-banner

    • hide the punch banner displayed on stdout
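
The options above can be combined on a single command line. The following is a minimal dry-run sketch, not a reference invocation: the punchline path, the master URL and the working directory are placeholder values chosen for illustration, and the script only prints the assembled command instead of running it.

```shell
#!/bin/sh
# Dry-run sketch: assemble a sparkctl command from the documented options.
# All values below are hypothetical placeholders, not defaults.
PUNCHLINE=/tmp/punchline.hjson        # hypothetical punchline file
SPARK_MASTER=spark://master:7077      # hypothetical master URL (spark://master:port)
WORK_DIR=/tmp/sparkctl-work           # hypothetical working directory

# Store the command as positional parameters so each argument stays intact.
set -- sparkctl \
  --punchline "$PUNCHLINE" \
  --spark-master "$SPARK_MASTER" \
  --deploy-mode cluster \
  --runtime spark \
  --spark-work-dir "$WORK_DIR" \
  --no-color \
  --hide-banner

# Join the arguments for display only.
CMD="$*"
printf '%s\n' "$CMD"
```

To actually submit, the final printf line could be replaced with "$@", assuming sparkctl is on the PATH of the operator environment.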

EXAMPLES

Launch a punchline using the pyspark runtime:

sparkctl --punchline /tmp/punchline.hjson --runtime pyspark -v
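
A local foreground run can be sketched the same way. This example assumes that 'local[*]' and the 'foreground' deploy mode can be combined, which the option list suggests but does not state explicitly. The master value is quoted because square brackets and asterisks are glob characters in most shells, and some shells (zsh, for instance) reject an unmatched pattern. The script prints the command rather than executing it.

```shell
#!/bin/sh
# Dry-run sketch: local foreground execution.
# Assumption: 'local[*]' and --deploy-mode foreground are accepted together.
# The single quotes keep the shell from expanding local[*] as a glob pattern.
set -- sparkctl \
  --punchline /tmp/punchline.hjson \
  --spark-master 'local[*]' \
  --deploy-mode foreground

# Join the arguments for display only; the quotes are consumed by the shell.
CMD_LOCAL="$*"
printf '%s\n' "$CMD_LOCAL"
```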

LOGGERS

The logging verbosity of sparkctl is controlled by the following two files:

  • $PUNCHPLATFORM_OPERATOR_INSTALL_DIR/log4j2/log4j2-punchline.xml
  • $PUNCHPLATFORM_OPERATOR_INSTALL_DIR/log4j2/log4j2.properties

ENVIRONMENT

sparkctl is used only internally and is not intended for end users. It is available in the operator terminal environment.

SEE ALSO

punchlinectl channelctl planctl environment