SPARKCTL

NAME

sparkctl: submit a Spark application either in the foreground or on a Spark cluster

DESCRIPTION

sparkctl is a low-level shell command intended for internal use only. It is subject to change without notice.

With sparkctl, you can submit a Spark application using a Java or Python runtime.

Both batch and streaming Spark applications are supported.

sparkctl --punchline mypunchline.hjson

OPTIONS

  • --punchline

    • path of your configuration file
  • --spark-master

    • Spark master to submit to: 'local[*]' for a local run, or spark://master:port for a cluster
  • --spark-cluster

    • submit to a spark cluster, resolving the spark_master url from your platform punchplatform.properties
  • --deploy-mode

    • submission mode: 'client', 'cluster' or 'foreground'
  • --runtime

    • 'spark' or 'pyspark'
  • --spark-work-dir

    • working directory where enriched configuration files will be stored and where the spark driver will store its files
  • --punchplatform-conf-dir

    • root directory of punchplatform.properties
  • --verbose

    • display the executed command and print each punchline node to stdout in dataframe format
  • --no-color

    • disable dataframe coloring in case your terminal does not support this feature
  • --hide-banner

    • hide punch banner displayed on stdout
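
As a minimal sketch of how the allowed values above constrain an invocation, the helper below checks --runtime and --deploy-mode against the lists in this section before printing a command line. The function name build_sparkctl_cmd and the punchline path are hypothetical, and the helper only prints the command; it never runs sparkctl.

```shell
#!/bin/sh
# Hypothetical helper (not part of sparkctl): validates option values
# against the OPTIONS list, then prints the resulting command line.
build_sparkctl_cmd() {
  punchline=$1
  runtime=$2
  deploy_mode=$3

  # --runtime accepts 'spark' or 'pyspark'
  case "$runtime" in
    spark|pyspark) ;;
    *) echo "unknown runtime: $runtime" >&2; return 1 ;;
  esac

  # --deploy-mode accepts 'client', 'cluster' or 'foreground'
  case "$deploy_mode" in
    client|cluster|foreground) ;;
    *) echo "unknown deploy mode: $deploy_mode" >&2; return 1 ;;
  esac

  echo "sparkctl --punchline $punchline --runtime $runtime --deploy-mode $deploy_mode"
}

# Example with a hypothetical punchline path
build_sparkctl_cmd /tmp/punchline.hjson pyspark foreground
```

Rejecting an unknown value before calling the real command keeps the failure close to its cause; the remaining options can be appended in the same way.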

EXAMPLES

Launch a punchline using pyspark runtime:

sparkctl --punchline /tmp/punchline.hjson --runtime pyspark -v
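
For comparison, a submission to a Spark cluster might look like the sketch below. The master URL keeps the spark://master:port placeholder from the OPTIONS section and the punchline path is hypothetical; whether the 'cluster' deploy mode suits your platform is an assumption to verify. The script prints the command rather than running it.

```shell
#!/bin/sh
# Placeholder master URL taken from OPTIONS: substitute your real host and port.
SPARK_MASTER="spark://master:port"
# Hypothetical punchline path.
PUNCHLINE="/tmp/punchline.hjson"

# Build the argument list with 'set --' so every value stays a single word.
set -- sparkctl \
  --punchline "$PUNCHLINE" \
  --spark-master "$SPARK_MASTER" \
  --deploy-mode cluster \
  --runtime spark

# Dry run: print the command. Replace this line with "$@" to execute it.
echo "$*"
```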

LIBRARIES

sparkctl uses the library located at:

  • $PUNCHPLATFORM_INSTALL_DIR/lib/punchplatform-analytics-client-*-jar-with-dependencies.jar

LOGGERS

The logging verbosity of sparkctl is controlled by the following two files:

  • $PUNCHPLATFORM_LOG4J_CONF_DIR/log4j2-punchline.xml
  • $PUNCHPLATFORM_LOG4J_CONF_DIR/log4j2.properties
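
Before troubleshooting a failed launch, it can help to confirm that the files named in LIBRARIES and LOGGERS are actually present. The sketch below assumes that PUNCHPLATFORM_INSTALL_DIR and PUNCHPLATFORM_LOG4J_CONF_DIR are exported by your environment; check_path is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path exists.
check_path() {
  if [ -e "$1" ]; then
    echo "found:   $1"
  else
    echo "missing: $1"
  fi
}

# Library jar: the glob is left unexpanded (and reported missing) if nothing matches.
for jar in "${PUNCHPLATFORM_INSTALL_DIR:-}"/lib/punchplatform-analytics-client-*-jar-with-dependencies.jar; do
  check_path "$jar"
done

# Logging configuration files
check_path "${PUNCHPLATFORM_LOG4J_CONF_DIR:-}/log4j2-punchline.xml"
check_path "${PUNCHPLATFORM_LOG4J_CONF_DIR:-}/log4j2.properties"
```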

ENVIRONMENT

sparkctl is used only internally and is not intended for end users. It is available in the operator terminal environment as well as on shiva and gateway nodes.

SEE ALSO

punchlinectl channelctl planctl environment