Plan

Plan CRD instances are managed by the Plan Operator.

Plan instances enable cron-like scheduling of batch pipelines:

  • sparkline
  • flinkline
  • stormline
  • application

Note

When a plan is combined with a stormline, flinkline or application, .spec.templateSpec.oneshot is forced to true.

...while remaining highly resilient.

Plan Operator lifecycle

Note

Only the core loop is described below; this is not the complete lifecycle of a plan instance.

A plan instance can go through five different phases, similar to Pod phases: Pending, Running, Succeeded, Failed and Unknown.

  • When a Plan CRD is submitted to Kubernetes, the plan Status is empty.
  • The Punch Operator catches the submitted event and updates Status to Pending.
  • It then schedules the required child (stormline, sparkline, application...).
  • When the schedule time is reached, the operator runs templating using the provided dates, creates the required child and launches it. Status is updated to Running.
  • The Punch Operator waits until the next schedule or the next event, then checks the child status. If it is neither Failed nor Succeeded, it continues to wait.
  • If the child Status is Succeeded or Failed, the operator stores this job information in the PreviousSchedule key and plans the next schedule.
  • If the child is Succeeded, the operator increments the schedules: Next becomes Current, and the new Next is calculated from the interval or cron.
  • If the child is Failed, the operator immediately retries the same schedule.
  • As long as the current child is not Succeeded, the operator does not trigger the next scheduling event of the plan instance.

  • If .spec.maxIteration is set, the Plan Operator stops scheduling new iterations once the value is reached.

  • If .spec.maxIteration is not set, the Plan Operator repeats the whole process in a never-ending loop.

Warning

The plan sets a finalizer on your child instance. This ensures the child is not deleted before the plan has retrieved its status. Once the plan has seen that the child is completed, the finalizer is removed.

Mutating/Validating webhooks

Using webhooks with Plan instances

Note

At this moment, there are no webhooks dedicated to Plan instances.

When .metadata.annotations.platform.gitlab.thalesdigital.io/platform: <PLATFORM_CRD_INSTANCE_NAME> is defined, the annotation is propagated to the .metadata.annotations of its sub-resources:

  • sparkline
  • stormline
  • flinkline
  • application

In this scenario, the default mutating/validating webhook will be triggered automatically.

Configuration

Note

.metadata.labels and .metadata.annotations are propagated to child resources (punchlines), which in turn propagate this metadata to their pods.

Native kubernetes fields

Fields such as:

  • .apiVersion
  • .kind
  • .metadata

...are common fields, part of Kubernetes terminology.

apiVersion: scheduler.gitlab.thalesdigital.io/v1
kind: Plan
metadata:
  name: plan-sample
...

The .metadata field is propagated to all sub-resources of the plan instance.

Customizing an instance based on .spec field

Note

The official Go time package is used for parsing dates and calculating durations, see: https://github.com/golang/go/tree/master/src/time

Go date layouts are not written with placeholder letters. A layout is the fixed reference time (Mon Jan 2 15:04:05 MST 2006) written in the desired output shape, so the values in a layout are constants and cannot be arbitrary dates, e.g. "2006-01-02T15:04:05-0700"

Note

The official Go templating package is used to apply templating to .spec.templateSpec, see: https://github.com/golang/go/tree/master/src/html/template

...
spec:
  # plan will use punchline version v1
  apiVersion: punchline.gitlab.thalesdigital.io/v1
  # plan will be executing a sparkline pipeline (can be Sparkline, Flinkline, Application, Stormline)
  kind: Sparkline
  # note: specifying cron and interval will result in cron overriding interval
  # only interval or cron should be set at a time
  #interval: 10s
  cron: "* * * * *"
  dates:
    # generate a "from" variable, used by the templating context
    from:
      offset: "-10h45m"
      format: "2006-01-02T15:04:05-0700"
    # generate a "to" variable, used by the templating context
    to:
      offset: "10h45m"
      format: "2006-01-02T15:04:05-0700"
  # dates generated by the plan instance will be used as templating
  # variable on .spec.templateSpec
  # .spec.templateSpec is the full application, flinkline, sparkline, stormline .spec configuration
  templateSpec:
    image: ghcr.io/punchplatform/sparkline:7.0.1-SNAPSHOT
    imagePullPolicy: IfNotPresent
    serviceAccount: admin-user
    garbageCollect: false
    initContainerImage: ghcr.io/punchplatform/resourcectl:7.0.1-SNAPSHOT
    implementation: java
    settings:
      spark.executor.instances: "1"
      spark.kubernetes.authenticate.driver.serviceAccountName: admin-user
    punchline:
      dag:
        - settings:
            input_data:
              - date: "{{ .from }}"
                name: from_date
              - date: "{{ .to }}"
                name: to_date
          component: input
          publish:
            - stream: data
          type: dataset_generator
        - settings:
            truncate: false
          component: show
          subscribe:
            - component: input
              stream: data
          type: show
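The "{{ .from }}" and "{{ .to }}" placeholders in the configuration above are Go template actions. The sketch below shows how such placeholders resolve against a set of generated dates. It is an illustration only: the render helper and the sample date values are hypothetical, and text/template is used here for simplicity (it shares its action syntax with the html/template package linked earlier, but does no HTML escaping). Which package and options the operator really uses is not confirmed by this page.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render (hypothetical helper) executes a Go template against a map of
// variables and fails if the template references an unknown variable.
func render(src string, vars map[string]string) (string, error) {
	tmpl, err := template.New("templateSpec").Option("missingkey=error").Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, vars); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Fragment modelled on the input_data settings of the example above.
	const src = `input_data:
  - date: "{{ .from }}"
    name: from_date
  - date: "{{ .to }}"
    name: to_date
`
	// Sample values; in a plan these would be generated from .spec.dates.
	vars := map[string]string{
		"from": "2024-03-10T01:15:00+0000",
		"to":   "2024-03-10T22:45:00+0000",
	}
	out, err := render(src, vars)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The missingkey=error option is a choice made for this sketch so that a typo in a variable name fails loudly instead of rendering an empty value.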

Example(s)

Standard / Sparkline Java / Dataset Generator to Stdout

{ !../examples/samples/punk/plan/plan_sparkline_sample.yaml! }

Standard / Application / helloworld to Stdout

{ !../examples/samples/punk/plan/plan_application_sample.yaml! }