sparkctl is a low-level shell designed for internal use only. It is subject to change without notice.
With sparkctl, you can submit a Spark application using a Java or Python runtime.
Both Spark batch and streaming applications are supported.
sparkctl --punchline mypunchline.hjson
- path to your configuration file
- whether to submit to a Spark cluster: 'local[*]' or spark://master:port
- submit to a Spark cluster by resolving the spark_master url from your platform's punchplatform.properties
- submission mode: 'client', 'cluster' or 'foreground'
- runtime to use: 'spark' or 'pyspark'
- working directory where enriched configuration files are stored and where the Spark driver stores its files
- root directory of punchplatform.properties
- display the executed command and print each punchline node to stdout in dataframe format
- disable dataframe coloring if your terminal does not support this feature
- hide the punch banner displayed on stdout
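The options above essentially parameterize a Spark submission. As a conceptual sketch only (this is NOT sparkctl's actual implementation, and all names below are illustrative assumptions), a wrapper like sparkctl can be pictured as assembling a spark-submit command line from those options:

```python
# Conceptual sketch of how a wrapper such as sparkctl could turn its
# options into a spark-submit invocation. The entry-point names
# (punchline_main.py, punchline.jar, MainClass) are hypothetical.

def build_submit_command(punchline, master="local[*]",
                         deploy_mode="client", runtime="spark"):
    """Return a spark-submit argument list for the given punchline file."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if runtime == "pyspark":
        # A pyspark punchline is driven by a Python entry point.
        cmd += ["punchline_main.py", "--punchline", punchline]
    else:
        # A java (spark) runtime punchline is driven by a jar.
        cmd += ["--class", "MainClass", "punchline.jar",
                "--punchline", punchline]
    return cmd

print(" ".join(build_submit_command("/tmp/punchline.hjson",
                                    runtime="pyspark")))
```

The point of the sketch is only the mapping: the configuration-file path, master url, and submission mode options each correspond to a standard spark-submit argument, while the runtime option selects the driver entry point.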
Launch a punchline using the pyspark runtime:
sparkctl --punchline /tmp/punchline.hjson --runtime pyspark -v
sparkctl uses the library located at:
The logging verbosity of sparkctl is controlled by the following two files:
sparkctl is used only internally and is not intended for end users. This shell is available in operator terminal environments as well as on shiva and gateway nodes.