
Frequently Asked Questions

Craig V6 versus Ella V7 Questions

Are all V6 features supported on V7?

No. The first Ella release, planned for September 2021, will provide complete support for all punchline variants and plans. However, the tenant and channel concept will not be supported as is.

That basically means you will be able to run any punch V6 plan or punchline using a kubectl command, but you will not benefit from the punch channelctl operator command.

That said, we plan to fill that gap by integrating a GitOps architecture to achieve the same goal. Check out the roadmap.

Is Shiva still used in Ella V7?

No. Shiva is fully replaced by Kubernetes.

Will V6 punchlines be compatible with V7?

Yes. All punch spark, storm and native punchlines are packaged as container images. Inside these images, the nodes and code are compatible.

Backward compatibility explanations will be provided as part of the official release note.

Punch versus Kast

Is Punch part of Kast?

No. The punch consists of a few Kubernetes-native applications and services that run on top of any Kubernetes distribution, including Kast. Kast is the officially supported Kubernetes platform, in particular for on-premise deployments.

Note that the Punch leverages Kast Flink pipelines. You can start these pipelines, as well as punch spark or native pipelines, directly on Kast or Kubernetes using kubectl and helm commands.

The punch provides a higher-level Kubernetes operator that makes it easier to define such pipelines using Kubernetes CRDs. The punch also provides a REST API to ease the runtime dependencies from punchlines to external resources (such as parsers, custom python or java nodes).

This operator and these REST services are not integrated in the Kast packages but are of course available if you want to try them.

Punch on Kubernetes

Does Punch require Kubernetes?

Yes. We do not plan to support other orchestrators. The new Punch design fully embraces Kubernetes and there is no plan to maintain a non-Kubernetes alternative.

Can I expose a TCP/UDP port outside the cluster?

Yes. One of the solutions, and maybe the simplest, is to use a NodePort Service. This is basically a Kubernetes Service exposing a port on the nodes and redirecting the input to a pod's port.

You need to set:

  • The targeted port on your pod.
  • A selector pointing to your pod, i.e. one or more of your pod's labels.

To ease organizing your services, it is recommended to:

  • Include your punchline name in your service name.
  • Name your port config after the node that uses it.

Example:

apiVersion: v1
kind: Service
metadata:
  name: my-punchline-service
spec:
  type: NodePort
  ports:
  - name: sysloginput        # named after the node that uses it
    protocol: TCP
    port: 5051
    targetPort: 5051         # the targeted port on your pod
    # no nodePort set: Kubernetes picks one in the 30000-32767 range
  selector:
    app: my-punchline        # one of your pod's labels

Should I use a ReplicaSet?

No. Use Deployment instead.

A Deployment is an object which can own ReplicaSets and update them and their Pods via declarative, server-side rolling updates. While ReplicaSets can be used independently, today they are mainly used by Deployments as a mechanism to orchestrate Pod creation, deletion and updates. When you use Deployments you do not have to worry about managing the ReplicaSets they create: Deployments own and manage their ReplicaSets. As such, it is recommended to use Deployments whenever you want ReplicaSets.
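As an illustration, here is a minimal Deployment sketch for a single-container workload; the name, labels, image and port are placeholders chosen to match the NodePort Service example above, not an official punch manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-punchline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-punchline
  template:
    metadata:
      labels:
        app: my-punchline                       # matches the selector of the NodePort Service above
    spec:
      containers:
      - name: my-punchline
        image: my-registry/my-punchline:1.0.0   # placeholder image
        ports:
        - containerPort: 5051                   # the port targeted by the Service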

How can I push my Elasticsearch mappings?

First of all, a quick reminder: punch-provided mappings should be pushed during the deployment process. That being said, you may have to push some mappings of your own. To do so, follow this documentation.

Batch/Stream punchline monitoring with Prometheus

Batch

In case you are using batch pipelines (long-running or short-lived), use our .spec.reporters[].type: pushgateway. Using Prometheus scraping can result in loss of metrics for short-lived pipelines, e.g. sparklines.

(truncated)
spec:
  (truncated)
  reporters:
  - type: kafka
    bootstrap.servers:
    - localhost:9092
    topic: topicName
    reporting_interval: 30
    encoding: lumberjack
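The reporter shown above is a kafka reporter. A minimal sketch of the pushgateway reporter mentioned earlier would only differ in its type; the remaining settings (such as the gateway address) are not shown here and must be taken from the reporter documentation:

(truncated)
spec:
  (truncated)
  reporters:
  - type: pushgateway
    # gateway address and other settings go here; see the reporter documentation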

Stream

If you are using stream pipelines (never-ending pipelines), use our .spec.metrics.port: <PORT_NUMBER>. This spawns a servlet from which Prometheus can scrape metrics.
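For example, on a stream punchline (the port number is an arbitrary choice):

(truncated)
spec:
  (truncated)
  metrics:
    port: 9102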

Spark metrics with Prometheus

Spark does not provide, and will never provide, native support for the Prometheus pushgateway. This is because they do not want to bring external dependencies into their spark-core project. See: Issue 1, Issue 2, Issue 3

However, Spark does provide native support for Prometheus scraping.

For a sparkline:

(truncated)
metadata:
  (truncated)
  annotations:
    (truncated)
    prometheus.io/path: /metrics/prometheus
    prometheus.io/scrape: "true"    # annotation values must be strings
    prometheus.io/port: "4040"
spec:
  (truncated)
  settings:
    spark.ui.prometheus.enabled: true
    spark.eventLog.logStageExecutorMetrics: true
    spark.metrics.conf.*.sink.prometheusServlet.class: org.apache.spark.metrics.sink.PrometheusServlet
    spark.metrics.conf.*.sink.prometheusServlet.path: /metrics/prometheus
Scraping multiple endpoints on the same pod with Prometheus

In some cases, you may want to scrape multiple endpoints that exist on the same pod.

For instance, see Spark Metrics, which exposes 3 endpoints for Prometheus to scrape:

  • /metrics/applications/prometheus/
  • /metrics/master/prometheus/
  • /metrics/prometheus/

For this use case, you might want to use:

Prometheus Service Monitor CRD
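As a sketch (assuming the prometheus-operator is deployed, and assuming a Service labelled app: spark with a port named metrics exposes these endpoints), a ServiceMonitor scraping the three paths could look like:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: spark-metrics                  # placeholder name
spec:
  selector:
    matchLabels:
      app: spark                       # placeholder: labels of the Service exposing the metrics
  endpoints:
  - port: metrics                      # placeholder: name of the Service port
    path: /metrics/applications/prometheus/
  - port: metrics
    path: /metrics/master/prometheus/
  - port: metrics
    path: /metrics/prometheus/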