Plans

Plans are special punch applications that periodically execute punchlines. Refer to the plan concept overview chapter for a quick introduction.

Batch punchlines are commonly used to perform aggregations or compute models. In short, they do useful work, but they need to be scheduled periodically.

Plans are the building block the punch provides to define how often a batch punchline runs and, more importantly, how its input time ranges are updated, so that each run processes the right input dataset.

Think of a plan as a way to say, for example: "every hour, execute that batch punchline over the last 4 hours of data".

Configuration

In this quick tour we will have a look at the simplest possible plan.

cd $PUNCHPLATFORM_CONF_DIR/samples/plans/basic

There you have a plan and the punchline template. Have a look at their content. Here is the plan:

---
version: '6.0'
model:
  dates:
    from:
      offset: -PT1m
      format: yyyy-MM-dd'T'HH:mmZ
    to:
      format: yyyy-MM-dd'T'HH:mmZ
settings:
  cron: '*/1 * * * *'

A plan is mainly in charge of generating dates. These dates are then consumed by a punchline.

Dates templating

The -PT1m offset uses the ISO 8601 duration notation and expresses "now minus one minute". As for the cron expression, it means: run the punchline every minute.
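To make these two settings concrete, here is a sketch of how the earlier "every hour, over the last 4 hours" example might be written. It reuses only the keys of the basic plan shown above. The -PT4h offset and the hourly cron expression are assumptions derived from that plan and have not been checked against a running platform.

```yaml
---
version: '6.0'
model:
  dates:
    from:
      # assumed: four hours before the execution time
      offset: -PT4h
      format: yyyy-MM-dd'T'HH:mmZ
    to:
      # no offset, as in the basic plan
      format: yyyy-MM-dd'T'HH:mmZ
settings:
  # standard cron syntax: at minute 0 of every hour
  cron: '0 * * * *'
```

Only the offset and the cron expression differ from the basic plan, which suggests that the schedule and the data window can be tuned independently.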

Here is the punchline template file content:

---
type: punchline
runtime: spark
version: '6.0'
dag:
- settings:
    input_data:
    - date: "{{ from }}"
      name: from_date
    - date: "{{ to }}"
      name: to_date
  component: input
  publish:
  - stream: data
  type: dataset_generator
- settings:
    truncate: false
  component: show
  subscribe:
  - component: input
    stream: data
  type: show

This simple punchline prints to stdout the columns generated by the input node. These columns contain the dates generated by the plan.
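As an illustration of the templating step, the fragment below sketches what the input node settings might look like once the plan has substituted its dates. The timestamps are made-up values for a hypothetical run at 10:01 UTC. The exact rendering depends on the platform time zone and date formatter, so treat it as an approximation and not as actual output.

```yaml
# Hypothetical rendering of the input node for one run.
# The dates below are illustrative values, not captured output.
- settings:
    input_data:
    - date: "2024-01-01T10:00+0000"   # was {{ from }}: execution time minus one minute
      name: from_date
    - date: "2024-01-01T10:01+0000"   # was {{ to }}: execution time
      name: to_date
  component: input
  publish:
  - stream: data
  type: dataset_generator
```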

Try it

To start this plan here is the command line:

planctl start --plan plan.yaml --template template.yaml

You will see the result of the spark punchline printed every minute. Check the dataset column in particular: that is where your dates appear. Every punchline property can be templated using dates or other values, which makes this simple mechanism very powerful.

Important

Punch plans benefit from many production-grade features such as persistent cursors, high availability, and monitoring. These are described in the plan chapter of the reference guide.

Plan Alternatives?

Plans are both simple and powerful. A number of well-known technologies provide the same sort of resilient scheduling power, for example Apache Airflow or Argo Workflows on Kubernetes. These technologies are considerably more complex and heavyweight. The beauty of punch plans is that they run on tiny, small, medium, or large-scale platforms.