Punchlet As a Function
Abstract
A punchlet is a function that transforms some input data. It is meant to be deployed in a punchlet runtime engine, which is typically part of a data processing pipeline.
Punchlets Explained
Consider a simple example: the following punchlet adds one field to whatever input data it is given:
{ [user][age] = 22; }
This syntax is short and easy to read. Here is how it actually works: a punchlet is an instance of a Java class, whose signature is defined as follows:
/*
 * Your punchlet is an object implementing some interface.
 * A punchlet is created and injected with some fields, in particular
 * resource JSON files, grok patterns, a so-called world Tuple to give your
 * punchlet access to the outside world, etc.
 * 
 * A punchlet must implement a single 'execute' function, the one that will be executed
 * with the input data.
 */
 class Punchlet {        
    /*
     * @param root : the root Tuple, containing the data (logs, events, whatever)
     */  
     void execute(Tuple root) {
         // your punchlet code
         root:[user][age] = 22;
     }
 }   
The Punch short notation exists only to make this simpler. When you write
{ [user][age] = 22; }
the Punch compiler takes care of filling in the rest, and at runtime additional resources are injected into your punchlet.
When you deploy that punchlet in a stream of data, it is applied to each data item passing through.
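To make the "one function applied to each item" idea concrete, here is a minimal Java sketch. It is not the Punch runtime or its API: `SketchTuple`, `SketchPunchlet` and `PunchletStreamSketch` are hypothetical names, and the tuple is modelled as a plain nested map purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Punch Tuple (NOT the real API):
// a nested map addressed by a path of keys.
class SketchTuple {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    // Sets a value at a nested path, creating intermediate levels as needed.
    @SuppressWarnings("unchecked")
    void set(Object value, String... path) {
        if (path.length == 0) {
            throw new IllegalArgumentException("path must not be empty");
        }
        Map<String, Object> node = fields;
        for (int i = 0; i < path.length - 1; i++) {
            Object child = node.get(path[i]);
            if (!(child instanceof Map)) {
                child = new LinkedHashMap<String, Object>();
                node.put(path[i], child);
            }
            node = (Map<String, Object>) child;
        }
        node.put(path[path.length - 1], value);
    }

    // Returns the value at a nested path, or null if any level is missing.
    @SuppressWarnings("unchecked")
    Object get(String... path) {
        Object node = fields;
        for (String key : path) {
            if (!(node instanceof Map)) {
                return null;
            }
            node = ((Map<String, Object>) node).get(key);
        }
        return node;
    }
}

// Hypothetical equivalent of the single-method punchlet contract.
interface SketchPunchlet {
    void execute(SketchTuple root);
}

class PunchletStreamSketch {
    // Applies the punchlet to every item, as an engine would on a stream.
    static List<SketchTuple> applyToAll(SketchPunchlet punchlet, List<SketchTuple> items) {
        for (SketchTuple item : items) {
            punchlet.execute(item);
        }
        return items;
    }

    public static void main(String[] args) {
        // Rough counterpart of the punchlet { [user][age] = 22; }
        SketchPunchlet addAge = root -> root.set(22, "user", "age");

        SketchTuple bob = new SketchTuple();
        bob.set("bob", "user", "name");
        List<SketchTuple> items = new ArrayList<>();
        items.add(bob);

        applyToAll(addAge, items);
        System.out.println(bob.get("user", "age")); // prints 22
    }
}
```

The point of the sketch is only the shape: the punchlet holds no loop of its own, and the surrounding engine decides which items it is called on.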
Packaging Punchlets
You can deliver punchlet files in several ways.
The simplest is a plain punchlet file:
{ [user][age] = 22; }
// @test(fields=[user][name]) bob 
{ [user][age] = 22; }
A better way is to ship your punchlet as part of a YAML file. You can then
add a description, more sophisticated input data, and resources, an important
punchlet feature described later on. Here is an example:
description: >
  Here is a simple example of the ipmatch operator.
  You want to know if an IP is in a list of ranges.
tests:
- logs:
    log: 172.16.0.2
- logs:
    log: 5.36.18.2
resources:
  ranges:
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
  - 127.0.0.1/32
punchlet: >
  {
    // Retrieve our IP domains from a punch resource.
    // Resources come from outside the punchlet. In this
    // example they are defined above, but in production
    // they typically come from an S3 store, a remote filesystem,
    // etc.
    Tuple ranges = getResourceTuple("ranges");
    // You can use that resource tuple like any other.
    [check] = ipmatch(ranges).contains([logs][log]);
  }
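How `ipmatch` is implemented is not covered here. As a rough illustration of what checking an address against CIDR ranges like those above involves, the following Java sketch performs an IPv4-only membership test. `CidrSketch` and its methods are hypothetical helpers, not part of Punch, and IPv6 is deliberately left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT the Punch ipmatch operator:
// an IPv4-only CIDR membership test for illustration.
class CidrSketch {
    // Parses a dotted-quad IPv4 address into a 32-bit value held in a long.
    static long toLong(String ip) {
        String[] parts = ip.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True if the address falls inside the given block, e.g. "172.16.0.0/12".
    static boolean inRange(String ip, String cidr) {
        String[] split = cidr.split("/");
        if (split.length != 2) {
            throw new IllegalArgumentException("not a CIDR block: " + cidr);
        }
        int prefix = Integer.parseInt(split[1]);
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("bad prefix length in: " + cidr);
        }
        // Keep only the leading 'prefix' bits of both addresses before comparing.
        long mask = prefix == 0 ? 0L : (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        return (toLong(ip) & mask) == (toLong(split[0]) & mask);
    }

    // True if the address falls inside at least one of the blocks.
    static boolean containsAny(List<String> ranges, String ip) {
        for (String range : ranges) {
            if (inRange(ip, range)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> ranges = Arrays.asList(
                "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.1/32");
        System.out.println(containsAny(ranges, "172.16.0.2")); // prints true
        System.out.println(containsAny(ranges, "5.36.18.2"));  // prints false
    }
}
```

With the two test logs from the YAML example, the first address falls inside 172.16.0.0/12 and the second matches none of the listed ranges.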
If you use the Punch language to write log parsers, the Punch provides an official Maven-based packaging that lets you add unit tests and sample log test files in a well-structured package. Refer to the online parser repository.