Plans

Batch punchlines are commonly used to perform aggregations, compute models, or do other useful work, but they need to be scheduled periodically.

Plans are the building block provided by the punch to let you define how often you want to run a punchline and, more importantly, how to update the time range of your input data so that every time your punchline runs, it processes the right input dataset.

Think of a plan as a means to say, for example: "every hour, execute that punchline over the last 4 hours of data".
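To make the idea concrete, here is a minimal Python sketch of that "every hour, over the last 4 hours" rule. It is not punch code: the function names `input_window` and `hourly_runs` are hypothetical and only illustrate how each run gets its own sliding input range.

```python
from datetime import datetime, timedelta, timezone


def input_window(run_time, lookback):
    """Return the (from, to) range that a run at run_time should read."""
    return run_time - lookback, run_time


def hourly_runs(first_run, count):
    """Return `count` consecutive run times, one hour apart."""
    return [first_run + timedelta(hours=i) for i in range(count)]


# Arbitrary start time chosen for the illustration.
first = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
for run in hourly_runs(first, 3):
    start, end = input_window(run, timedelta(hours=4))
    print(f"run at {run:%H:%M} reads {start:%H:%M} -> {end:%H:%M}")
```

Each run reads a window that ends at its own start time, so consecutive windows overlap by three hours in this example.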

In this quick tour we will have a look at the simplest possible plan.

cd $PUNCHPLATFORM_CONF_DIR/samples/plans/basic

There you will find a plan and a punchline template. Here is the content of the punchline template:

{
    channel: channelexample
    runtime: spark
    version: "6.0"
    tenant: mytenant
    dag: [
        {
            type: dataset_generator
            component: input
            settings: {
                input_data: [
                    {
                        name: from_date
                        date: "{{ from }}"
                    }
                    {
                        name: to_date
                        date: "{{ to }}"
                    }
                ]
            }
            publish: [
                {
                    stream: data
                }
            ]
        }
        {
            type: show
            component: show
            settings: {
                truncate: false
            }
            subscribe: [
                {
                    component: input
                    stream: data
                }
            ]
        }
    ]
}

This very simple punchline just prints two generated columns that hold the provided dates. You guessed it: the plan itself is in charge of providing them. Here is the plan configuration file plan.hjson:

{
  tenant: mytenant
  version: "6.0"
  channel: channelexample
  name: planname
  model:{
    dates: {
       from: {
         offset: -PT1m
         format: yyyy-MM-dd'T'HH:mmZ
       }
       to: {
         format: yyyy-MM-dd'T'HH:mmZ
       }
    }
  }
  plan_settings: {
    cron: "*/1 * * * *"
  }
}

The -PT1m offset uses the ISO 8601 duration notation to express "now minus one minute". As for the cron expression, it means: run the punchline every minute. Start the plan with:

planctl start --plan plan.hjson --template punchline.template

You will see the result of the spark punchline printed every minute. Check the dataset column in particular: that is where your dates appear. This simple mechanism is, in fact, very powerful. You can use many templated variables to consume the dataset you need from the indexes or data sources you need. Everything can be templated.
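The following Python sketch imitates what happens at each tick of this example: it computes the from and to dates from the plan model and substitutes them into the template placeholders. It is only an approximation under stated assumptions, not the punch implementation: the helpers `parse_offset`, `build_dates` and `render` are hypothetical, the offset parser handles minute-only durations, and the strftime pattern is an assumed equivalent of yyyy-MM-dd'T'HH:mmZ.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed strftime equivalent of the plan's yyyy-MM-dd'T'HH:mmZ pattern.
DATE_FORMAT = "%Y-%m-%dT%H:%M%z"


def parse_offset(text):
    """Parse a minute-only ISO 8601 duration such as '-PT1m' into a timedelta."""
    match = re.fullmatch(r"([+-]?)PT(\d+)M", text.upper())
    if match is None:
        raise ValueError(f"unsupported offset: {text!r}")
    sign = -1 if match.group(1) == "-" else 1
    return sign * timedelta(minutes=int(match.group(2)))


def build_dates(now, model):
    """Compute one formatted date per entry of the plan's `dates` model."""
    dates = {}
    for name, spec in model.items():
        # An entry without an offset resolves to the current time.
        moment = now + parse_offset(spec.get("offset", "PT0M"))
        dates[name] = moment.strftime(DATE_FORMAT)
    return dates


def render(template, values):
    """Replace every {{ name }} placeholder with its value."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: values[m.group(1)], template)


# Same shape as the `dates` section of plan.hjson (format omitted for brevity).
model = {"from": {"offset": "-PT1m"}, "to": {}}
now = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
template = 'date: "{{ from }}"\ndate: "{{ to }}"'
print(render(template, build_dates(now, model)))
```

With the fixed time used here, the two rendered lines carry 2024-01-01T12:29+0000 and 2024-01-01T12:30+0000. One minute later the same computation yields the next one-minute window, which is how every run ends up with its own input range.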

Important

Punch plans benefit from many production-grade features such as persistent cursors, high availability, and monitoring. These are described in the reference guide.