Punchlet As a Function

Abstract

A punchlet is a function that transforms some input data. It is meant to be deployed in a punchlet runtime engine, which is typically part of a data processing pipeline.

Punchlets Explained

Consider a simple example: the following punchlet adds one field to whatever input data it is given:

{ [user][age] = 22; }

This syntax is short and easy. Here is how it actually works. A punchlet is a Java class instance. Its actual signature is defined as follows:

/*
 * Your punchlet is an object implementing some interface.
 * A punchlet is created and injected with some fields, in particular
 * resource JSON files, grok patterns, a so-called world Tuple that gives
 * your punchlet access to the outside world, etc.
 *
 * A punchlet must implement a single 'execute' function, the one that
 * is executed with the input data.
 */
class Punchlet {
    /*
     * @param root : the root Tuple, containing the data (logs, events, whatever)
     */
    void execute(Tuple root) {
        // your punchlet code
        root:[user][age] = 22;
    }
}

The Punch short notation exists only to spare you this boilerplate. When you write

{ [user][age] = 22; }

the Punch compiler fills in the rest, and at runtime additional resources are injected into your punchlet.

When you deploy that punchlet in a stream of data, it is applied to each data item that flows through.
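The "punchlet as a function" idea can be sketched in plain Java. This is a minimal illustration only, not the Punch runtime: the SketchPunchlet interface, the PunchletSketch class, and the use of nested maps in place of the real Tuple type are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the punchlet contract; NOT the real Punch API.
interface SketchPunchlet {
    // root plays the role of the root Tuple, modelled here as a nested map.
    void execute(Map<String, Object> root);
}

class PunchletSketch {
    // Equivalent in spirit to the punchlet { [user][age] = 22; }
    static final SketchPunchlet ADD_AGE = root -> {
        @SuppressWarnings("unchecked")
        Map<String, Object> user = (Map<String, Object>)
                root.computeIfAbsent("user", k -> new LinkedHashMap<String, Object>());
        user.put("age", 22);
    };

    // Apply one punchlet to every item of a (here, in-memory) stream of data.
    static List<Map<String, Object>> applyToAll(SketchPunchlet punchlet,
                                                List<Map<String, Object>> items) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> item : items) {
            punchlet.execute(item); // the item is modified in place
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        // First item already carries [user][name] = bob, second one is empty.
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("name", "bob");
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("user", user);
        Map<String, Object> second = new LinkedHashMap<>();

        List<Map<String, Object>> items = new ArrayList<>();
        items.add(first);
        items.add(second);

        // Both items end up with a user.age field set to 22.
        System.out.println(applyToAll(ADD_AGE, items));
    }
}
```

The point of the sketch is that the punchlet itself knows nothing about the stream: the surrounding engine decides where items come from and simply calls execute on each of them.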

Packaging Punchlets

You can deliver punchlet files in several ways.

The simplest is a bare punchlet file:

{ [user][age] = 22; }

To put some test data in front, so that users can try the punchlet right away, use the @test annotation:

// @test(fields=[user][name]) bob 
{ [user][age] = 22; }

A better way is to ship your punchlet as part of a YAML file. You can then add a description, more sophisticated input data, and resources, an important punchlet feature described later on. Here is an example:

description: >
  Here is a simple example of the ipmatch operator.
  You want to know if an IP is in a list of ranges.
tests:
- logs:
    log: 172.16.0.2
- logs:
    log: 5.36.18.2
resources:
  ranges:
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
  - 127.0.0.1/32
punchlet: >
  {
    // Retrieve our IP ranges from a punch resource.
    // Resources come from outside the punchlet. In this
    // example they are defined above, but in production
    // they typically come from an S3 store, a remote
    // filesystem, etc.
    Tuple ranges = getResourceTuple("ranges");

    // You can use that resource tuple like any other.
    [check] = ipmatch(ranges).contains([logs][log]);
  }

As you can see, this makes a punchlet well documented.
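To make the expected result of the two test logs concrete, here is a rough sketch of what an "is this IP in one of these ranges" check amounts to. It is not the implementation of the Punch ipmatch operator, whose internals are not described here; the IpRangeSketch class and its method names are hypothetical, and the sketch handles IPv4 only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating CIDR matching; NOT the Punch ipmatch operator.
class IpRangeSketch {
    // Convert a dotted IPv4 address to its 32-bit value, held in a long.
    static long toLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True if ip falls inside the CIDR range, e.g. "172.16.0.0/12".
    static boolean inRange(String ip, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        // Keep only the top 'prefix' bits of a 32-bit address.
        long mask = (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        return (toLong(ip) & mask) == (toLong(parts[0]) & mask);
    }

    // True if ip falls inside at least one of the ranges.
    static boolean matchesAny(String ip, List<String> ranges) {
        for (String range : ranges) {
            if (inRange(ip, range)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same ranges and test logs as in the YAML example.
        List<String> ranges = Arrays.asList(
                "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.1/32");
        System.out.println(matchesAny("172.16.0.2", ranges)); // inside 172.16.0.0/12
        System.out.println(matchesAny("5.36.18.2", ranges));  // in none of the ranges
    }
}
```

With the ranges from the YAML file, the first test log falls inside 172.16.0.0/12 and the second falls inside none of the ranges, which is the distinction the [check] field is meant to capture.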

If you use the Punch language to write log parsers, Punch provides an official Maven-based packaging that lets you add unit tests and sample log test files in a well-structured package. Refer to the online parser repository.