Frequently Asked Questions¶
Craig V6 versus Ella V7 Questions¶
Are all V6 features supported on V7?
No. The first Ella release, planned for September 2021, will provide complete support
for all punchline variants and plans. However, the
will not be supported as is.
That basically means you will be able to run any punch V6 plan or punchline using a
command, but you will not benefit from the punch
channelctl operator command.
This said, we plan to fill that gap by integrating a GitOps architecture to achieve the same goal. Check out the roadmap.
Is Shiva still used in Ella V7?
No. Shiva is fully replaced by Kubernetes.
Will V6 punchlines be compatible with V7?
Yes. All punch spark, storm and native punchlines are packaged as container images. The nodes and code inside are compatible.
Backward compatibility explanations will be provided as part of the official release note.
Punch versus Kast¶
Is Punch part of Kast?
No. The punch consists of a few Kubernetes-native applications and services that run on top of any Kubernetes distribution, including Kast. Kast is the officially supported Kubernetes platform, in particular for on-premise deployments.
Note that the Punch leverages Kast Flink pipelines. You can start these pipelines, as well as punch spark or native pipelines, directly on Kast or Kubernetes using kubectl and helm commands.
The punch provides a higher-level Kubernetes operator that makes it easier to define such pipelines using Kubernetes CRDs. The punch also provides a REST API to ease the runtime dependencies from punchlines to external resources (such as parsers, custom python or java nodes).
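As an illustration only, a punchline defined through such an operator could look like the sketch below. The API group, kind, and spec fields are hypothetical assumptions made for this example, not the actual punch CRD schema.

```yaml
# Hypothetical manifest: apiVersion, kind and spec fields are
# illustrative assumptions, not the real punch CRD schema.
apiVersion: punchline.example.com/v1
kind: Punchline
metadata:
  name: my-punchline
spec:
  runtime: spark                       # or storm / native
  image: my-punchline-image:latest     # placeholder image name
  settings:
    parallelism: 2
```

You would then submit it like any other Kubernetes resource, e.g. with kubectl apply -f my-punchline.yaml, and let the operator reconcile it.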
These operator and REST services are not integrated in the Kast packages but are of course available if you want to try them.
Punch on Kubernetes¶
Does Punch require Kubernetes?
Yes. We do not plan to support other orchestrators. The new Punch design fully embraces Kubernetes and there is no plan to maintain a non-Kubernetes alternative.
Can I expose a TCP/UDP port outside the cluster?
Yes. One of the solutions, and maybe the simplest, is to use a NodePort Service. This is basically a Kube Service that exposes a port on the cluster nodes and redirects incoming traffic to a pod's port.
You need to set:
- The targeted port on your pod.
- A selector pointing to your pod, i.e. one or more of your pod's labels.
To ease organizing your services, it is recommended to:
- Include your punchline name in your service name.
- Name your port config after the node that uses it.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-punchline-service
spec:
  type: NodePort
  ports:
    - name: sysloginput
      protocol: TCP
      port: 5051
      targetPort: 5051
  selector:
    app: my-punchline
```
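By default, Kubernetes allocates the externally exposed node port from the 30000-32767 range. If you need a stable, well-known port, you can pin it explicitly with the nodePort field; 30051 below is an arbitrary example value:

```yaml
ports:
  - name: sysloginput
    protocol: TCP
    port: 5051
    targetPort: 5051
    nodePort: 30051   # must fall within the cluster's NodePort range (default 30000-32767)
```

Clients can then reach the punchline at any-node-ip:30051.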
Should I Use ReplicaSets?
A Deployment is an object which can own ReplicaSets and update them and their Pods via declarative, server-side rolling updates. While ReplicaSets can be used independently, today they are mainly used by Deployments as a mechanism to orchestrate Pod creation, deletion and updates. When you use Deployments you don't have to worry about managing the ReplicaSets they create: Deployments own and manage their ReplicaSets. As such, it is recommended to use Deployments when you want ReplicaSets.
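As a minimal sketch, a Deployment wrapping the pod targeted by the NodePort example above could look like this; the image name is a placeholder, not an actual punch image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-punchline
spec:
  replicas: 2                  # the Deployment creates a ReplicaSet to keep 2 pods running
  selector:
    matchLabels:
      app: my-punchline
  template:
    metadata:
      labels:
        app: my-punchline      # matches the selector of the NodePort Service
    spec:
      containers:
        - name: punchline
          image: my-punchline-image:latest   # placeholder image
          ports:
            - containerPort: 5051
```

Updating the image or the replica count in this manifest triggers a rolling update; you never manipulate the underlying ReplicaSet directly.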
How can I push my Elasticsearch mappings?
First of all, a quick reminder: punch-provided mappings are pushed as part of the deployment process. That being said, you may have to push some mappings of your own. To do so, follow this documentation.
Batch/stream punchline monitoring with Prometheus
If you are using batch pipelines (long-running or short-lived), use our
reporters: Prometheus scraping can result in metrics loss for short-lived pipelines. E.g.:
```yaml
# (truncated)
spec:
  # (truncated)
  reporters:
    - type: kafka
      bootstrap.servers:
        - localhost:9092
      topic: topicName
      reporting_interval: 30
      encoding: lumberjack
```
If you are using stream pipelines (never-ending pipelines), use our
Prometheus scraping support. This will spawn a servlet from which Prometheus can scrape metrics.
Spark metrics with Prometheus
Spark does not provide, and will never provide, native support for the Prometheus Pushgateway feature. This is because the Spark maintainers do not want to bring external dependencies into their spark-core project. See: Issue 1, Issue 2, Issue 3
They do, however, provide native support for Prometheus scraping.
```yaml
# (truncated)
metadata:
  # (truncated)
  annotations:
    # (truncated)
    prometheus.io/path: /metrics/prometheus
    prometheus.io/scrape: "true"    # annotation values must be strings
    prometheus.io/port: "4040"
spec:
  # (truncated)
  settings:
    spark.ui.prometheus.enabled: true
    spark.eventLog.logStageExecutorMetrics: true
    spark.metrics.conf.*.sink.prometheusServlet.class: org.apache.spark.metrics.sink.PrometheusServlet
    spark.metrics.conf.*.sink.prometheusServlet.path: /metrics/prometheus
```
Scraping multiple endpoints on the same pod with Prometheus
In some cases, you may want to scrape multiple endpoints that exist on the same pod.
For instance, see Spark metrics above, which exposes 3 endpoints for Prometheus to scrape:
For this use case, you might want to use: