The PunchOperator is a set of Kubernetes Operators that manage pipelines, each exposed as a Kubernetes Custom Resource Definition (CRD):
- Stormline : Punchline in Storm runtime for stream use cases.
- Sparkline : Punchline in Spark runtime for batch use cases.
- Flinkline : Punchline in Flink runtime for stream and batch use cases.
- Application : Generic Application such as Logstash.
A resilient cron-like scheduler is also provided. It can run a batch pipeline repeatedly on a schedule.
- Plan : Wrapper that schedules any of the supported pipeline CRs (see above) like a cron job.
Because these pipelines run in critical environments and may use sensitive resources during their lifecycle, a dedicated CRD handles this sensitive information:
- Platform : Platform information to enrich pipeline CR through annotations with sensitive information (secrets, host:port, ...)
If you need resources, such as external jars or pexs, to be included in your punchline at runtime, list them under the dependencies key, using the Maven packaging syntax:
spec:
  dependencies:
    - punch-parsers:org.thales.punch:punch-websense-parsers:1.0.0
    - punch-parsers:org.thales.punch:common-punchlets:4.0.2
    - file:org.thales.punch:geoip-resources:1.0.1
    - additional-pyspark-pex:org.thales.punch:punch-ia-functions:1.1.2
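Each entry in the example above is made of four colon-separated parts. The sketch below splits one entry into those parts so that a malformed line can be caught before the CR is applied. The field labels (kind, group, artifact, version) and the function name parse_dependency are assumptions chosen for illustration; they are not terms defined by the operator.

```shell
#!/bin/sh
# Split one dependency entry into its four colon-separated fields.
# The labels kind/group/artifact/version are illustrative assumptions.
parse_dependency() {
  entry="$1"
  # IFS=: applies only to this read; "extra" catches any fifth field.
  IFS=: read -r kind group artifact version extra <<EOF
$entry
EOF
  # Require exactly four non-empty fields.
  if [ -z "$kind" ] || [ -z "$group" ] || [ -z "$artifact" ] \
     || [ -z "$version" ] || [ -n "$extra" ]; then
    echo "malformed dependency: $entry" >&2
    return 1
  fi
  echo "kind=$kind group=$group artifact=$artifact version=$version"
}

parse_dependency "file:org.thales.punch:geoip-resources:1.0.1"
# prints: kind=file group=org.thales.punch artifact=geoip-resources version=1.0.1
```

This only checks the shape of an entry. Whether the artifact actually exists is decided when the operator downloads it.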
The Punchplatform Operator will instantiate an InitContainer based on the resourcectl image. This container downloads the specified dependencies, then extracts and mounts them. The downloaded resources will be located in /opt/punch on the
Start Punch CRD
kubectl apply -f /path/to/punchline.yaml
Stop Punch CRD
kubectl delete -f /path/to/punchline.yaml
List Punch CRD
kubectl get stormlines
kubectl get sparklines
kubectl get flinklines
kubectl get plans
kubectl get platforms
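To check every resource type in one pass, the list commands can be wrapped in a small loop. This is a minimal sketch: the function name list_punch_resources and the KUBECTL override variable are hypothetical helpers, and the resource names are simply those used in the list commands.

```shell
#!/bin/sh
# Resource names taken from the list commands in this section.
PUNCH_KINDS="stormlines sparklines flinklines plans platforms"

# Run "kubectl get" for each Punch resource type.
# Extra arguments (for example: -o wide) are passed through to kubectl.
# KUBECTL is a hypothetical override, useful to preview the commands.
list_punch_resources() {
  for kind in $PUNCH_KINDS; do
    echo "== $kind =="
    "${KUBECTL:-kubectl}" get "$kind" "$@"
  done
}
```

Calling list_punch_resources -o wide runs the five commands in turn. Setting KUBECTL=echo beforehand prints the arguments each command would receive instead of contacting a cluster.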