SPARKCTL¶
NAME¶
sparkctl - submit spark and pyspark punchline applications
DESCRIPTION¶
sparkctl is a low-level shell designed as an internal shell only. This shell is subject to change without notice.
With sparkctl, you can submit spark applications using a java or python runtime.
Execution of spark batch and streaming applications is supported.
sparkctl --punchline mypunchline.hjson
OPTIONS¶
- --punchline : path of your punchline configuration file
- --spark-master : submit to a spark cluster or not: 'local[*]' or spark://master:port
- --spark-cluster : submit to a spark cluster by resolving the spark master url from your platform punchplatform.properties
- --deploy-mode : submission mode: 'client', 'cluster' or 'foreground'
- --runtime : 'spark' or 'pyspark'
- --spark-work-dir : working directory where enriched configuration files are stored and where the spark driver stores its files
- --punchplatform-conf-dir : root directory containing punchplatform.properties
- --verbose : display the executed command and print each punchline node to stdout in dataframe format
- --no-color : disable dataframe coloring in case your terminal does not support this feature
- --hide-banner : hide the punch banner displayed on stdout
EXAMPLES¶
Launch a punchline using the pyspark runtime:
sparkctl --punchline /tmp/punchline.hjson --runtime pyspark -v
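For scripted or automated submissions, the command line can be assembled programmatically. The sketch below is only an illustration: the `build_sparkctl_cmd` helper is hypothetical and not part of the platform, but every flag it emits comes from the OPTIONS section above.

```python
import subprocess

def build_sparkctl_cmd(punchline, runtime="spark", deploy_mode=None,
                       spark_master=None, verbose=False):
    """Build a sparkctl argument vector.

    Flag names are taken from the OPTIONS section of this page;
    this helper itself is illustrative only.
    """
    cmd = ["sparkctl", "--punchline", punchline, "--runtime", runtime]
    if deploy_mode:
        cmd += ["--deploy-mode", deploy_mode]
    if spark_master:
        cmd += ["--spark-master", spark_master]
    if verbose:
        cmd.append("--verbose")
    return cmd

# Example: a pyspark punchline submitted in client mode.
cmd = build_sparkctl_cmd("/tmp/punchline.hjson", runtime="pyspark",
                         deploy_mode="client", verbose=True)
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # only on a host where sparkctl is installed
```

Keeping the invocation as a plain argument list (rather than a shell string) avoids quoting issues when punchline paths contain spaces.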
LIBRARIES¶
sparkctl uses the library located at:
- $PUNCHPLATFORM_INSTALL_DIR/lib/punchplatform-analytics-client-*-jar-with-dependencies.jar
LOGGERS¶
The logging verbosity of sparkctl is controlled by the following two files:
- $PUNCHPLATFORM_LOG4J_CONF_DIR/log4j2-punchline.xml
- $PUNCHPLATFORM_LOG4J_CONF_DIR/log4j2.properties
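To change the verbosity, edit the log4j2 configuration. The fragment below is a minimal illustrative log4j2 XML sketch, not the file actually shipped in $PUNCHPLATFORM_LOG4J_CONF_DIR; it shows the standard log4j2 way of raising the root logger level to debug with a console appender.

```xml
<Configuration status="warn">
  <Appenders>
    <Console name="stdout" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %-5p %c - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- Lower this level back to "info" or "warn" for normal operation. -->
    <Root level="debug">
      <AppenderRef ref="stdout"/>
    </Root>
  </Loggers>
</Configuration>
```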
ENVIRONMENT¶
sparkctl is used only internally and is not intended for direct use by end users. This shell is available in the operator terminal environment, but also on shiva and gateway nodes.