HOWTO deploy an external library

Why do that?

  • You may have developed a custom node for Spark/Pyspark or Storm in Java or Python.
  • You may have developed a custom application for Shiva in Java or Python and, after conclusive tests, you want to deploy it on your running platform.
  • You want to import an external library for your use case and add it to your classpath when submitting a punchline to Spark, Pyspark, Storm or Shiva.

Compatible components

The following components support additional libraries: Storm, Spark, Pyspark and Shiva.

Deployment Procedure

The punch makes it easy and safe to add an external library to a production platform.

1. Prerequisite

Your external library (your custom node jar or pex file, for example) must be available on your deployer server.
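A quick sanity check on the library file can catch a wrong path or an unexpected file type before you go further. The sketch below assumes the library is a `.jar` or `.pex` file, as in the examples above. The function name `check_extlib` is hypothetical and is not part of the punch tooling.

```shell
#!/bin/sh
# Hypothetical helper (not part of the punch tooling).
# Returns 0 if the path is a readable regular file ending in .jar or .pex,
# 1 if the file is missing or unreadable, 2 if the extension is unexpected.
check_extlib() {
  lib="$1"
  if [ ! -f "$lib" ] || [ ! -r "$lib" ]; then
    echo "not a readable file: $lib" >&2
    return 1
  fi
  case "$lib" in
    *.jar|*.pex) return 0 ;;
    *) echo "unexpected extension (expected .jar or .pex): $lib" >&2
       return 2 ;;
  esac
}
# Example (commented out, the path is a placeholder):
# check_extlib /path/to/my-custom-node.jar && echo "library ready"
```

The distinct return codes let a calling script tell a missing file apart from a file of the wrong type.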

2. Preparation

Next update the deployer folder. That folder is project specific and located on the deployer server (i.e. the server from where you deployed your platform).

If it does not already exist, create the folder for the component you want to extend:

$ cd <deployer_path>
$ mkdir -p archives/extlib/storm
or
$ cd <deployer_path>
$ mkdir -p archives/extlib/spark
or
$ cd <deployer_path>
$ mkdir -p archives/extlib/pyspark
or
$ cd <deployer_path>
$ mkdir -p archives/extlib/shiva

Copy your external library into that folder. Create a git tag before deploying.
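The folder creation and copy steps above can be combined into one small helper. This is only a minimal sketch: `stage_extlib` is a hypothetical function name, and the only layout it assumes is the `archives/extlib/<component>` folders listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the punch tooling).
# Copies a library into <deployer_path>/archives/extlib/<component>.
# Usage: stage_extlib <deployer_path> <component> <library_file>
stage_extlib() {
  deployer_path="$1"
  component="$2"
  library="$3"
  # Only the four component folders named in this procedure are accepted.
  case "$component" in
    storm|spark|pyspark|shiva) ;;
    *) echo "unknown component: $component" >&2
       return 1 ;;
  esac
  if [ ! -f "$library" ]; then
    echo "library not found: $library" >&2
    return 1
  fi
  target="$deployer_path/archives/extlib/$component"
  # Create the folder if needed, then copy the library into it.
  mkdir -p "$target" && cp "$library" "$target/"
}
```

Rejecting unknown component names early avoids creating a folder that the deployer would presumably never read.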

Next, still on the same deployer server, go to $PUNCHPLATFORM_CONF_DIR and tag the current version:

cd $PUNCHPLATFORM_CONF_DIR
git pull
git tag -a v<CURRENT_VERSION> -m "before adding new library <MY_LIBRARY>"
git push --tags
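Before creating the tag, it can help to check that the tag name is not already taken, since `git tag -a` fails on an existing name. The sketch below relies only on standard git commands. The function name `tag_exists` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the punch tooling).
# Succeeds if <tag> already exists in the git repository at <repo_dir>.
# Usage: tag_exists <repo_dir> <tag>
tag_exists() {
  git -C "$1" rev-parse -q --verify "refs/tags/$2" >/dev/null
}
# Example (commented out, keep the version placeholder from the procedure):
# tag_exists "$PUNCHPLATFORM_CONF_DIR" "v<CURRENT_VERSION>" \
#   && echo "tag already exists, choose another version"
```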

Generate the deployment configuration with the following command. The generated files have no impact on the running platform; this is only a local operation.

punchplatform-deployer.sh --generate-inventory

That command regenerates the inventories and variables that ansible uses to deploy the punch components. It has no effect on the running platform. If errors are reported, check the punchplatform-deployment.settings configuration and contact the punch support.

Push the new configuration. This operation has no impact on the running platform.

git push

Important

Before going to the next step, make sure you have communicated your procedure to the platform stakeholders.

3. Deployment

Deploy the library to all target servers. Depending on what you want to update (storm, spark, pyspark or shiva), execute:

$ punchplatform-deployer.sh --deploy -Kk -t storm
or:
$ punchplatform-deployer.sh --deploy -Kk -t spark
or:
$ punchplatform-deployer.sh --deploy -Kk -t shiva
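When scripting this step, printing the command before running it lets you review it first. The sketch below only builds the command string from the three targets shown above and rejects anything else, since no other `-t` value appears in this procedure. The function name `deploy_command` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the punch tooling).
# Prints the deploy command for a component; nothing is executed.
# Usage: deploy_command <component>
deploy_command() {
  case "$1" in
    storm|spark|shiva)
      echo "punchplatform-deployer.sh --deploy -Kk -t $1" ;;
    *)
      # Only the targets listed in this procedure are accepted.
      echo "no deploy command documented here for: $1" >&2
      return 1 ;;
  esac
}
# Example (commented out):
# deploy_command storm
```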

Next, validate the new library by running your use case.

Tip

If errors appear, do not panic! Roll back your environment with the old deployer (roll back the deployment environment variable and redeploy operators).

4. Final Checks

Tag the end of migration from the operator device:

cd $PUNCHPLATFORM_CONF_DIR
git pull
git tag -a v<NEW_VERSION> -m "end of migration to <NEW_VERSION>"
git push --tags

5. Rollback procedure

Remove the content of the external libraries directory on the target nodes. Depending on what you have updated (storm, spark, pyspark or shiva), execute:

$ punchplatform-deployer.sh --ssh storm_servers "rm -rf {{install_root}}/{{binaries_version}}/extlib/storm/*"
or:
$ punchplatform-deployer.sh --ssh spark_servers "rm -rf {{install_root}}/{{binaries_version}}/extlib/spark/*"
or:
$ punchplatform-deployer.sh --ssh spark_servers "rm -rf {{install_root}}/{{binaries_version}}/extlib/pyspark/*"
or:
$ punchplatform-deployer.sh --ssh spark_servers "rm -rf {{install_root}}/{{binaries_version}}/extlib/shiva/*"
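Because these commands delete files remotely, it can be safer to print the exact command and review it before running it. The sketch below only assembles the string and leaves the `{{install_root}}` and `{{binaries_version}}` placeholders untouched, as in the commands above. The function name `rollback_command` is hypothetical, and the caller supplies the server group and component.

```shell
#!/bin/sh
# Hypothetical helper (not part of the punch tooling).
# Prints the rollback command for review; nothing is executed.
# Usage: rollback_command <server_group> <component>
rollback_command() {
  # Refuse empty arguments: an empty component would widen the rm target.
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: rollback_command <server_group> <component>" >&2
    return 1
  fi
  # The {{...}} placeholders are kept literal, exactly as in the procedure.
  printf '%s\n' "punchplatform-deployer.sh --ssh $1 \"rm -rf {{install_root}}/{{binaries_version}}/extlib/$2/*\""
}
# Example (commented out):
# rollback_command storm_servers storm
```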

Then, on the deployment device, either restore the previous libraries in the external libraries directory or remove the existing ones, and redeploy:

$ punchplatform-deployer.sh --deploy -Kk -t storm
or:
$ punchplatform-deployer.sh --deploy -Kk -t spark
or:
$ punchplatform-deployer.sh --deploy -Kk -t shiva