Channels

Abstract

This chapter explains how you assemble various applications into a channel.

Because it involves several distinct parts, a channel configuration consists of several files, all grouped in a common per-tenant folder. As an example, here is the layout of one of the sample channels delivered with the PunchPlatform standalone installation. These channels are representative of a log management solution.

resources
    elastalert
          /** rules of elastalert **/
    elasticsearch
          /** elasticsearch mapping templates **/
    injector
          /** log injector to simulate logs **/
    kibana
          /** kibana dashboards **/
    punch
          /** the punchlets and associated resource files **/

 tenants
    mytenant
        channels
            sourcefire
                channel_structure.json
                input.json
        etc
            /** tenant configuration directory **/

Before explaining the content of each, here is the big picture:

  • mytenant is the name of the tenant. Tenant names are user-defined; this one was chosen in the demo setup for illustrative purposes.
  • sourcefire is the name of a channel: all sourcefire equipment logs for this tenant are handled by this channel. (This assumes, of course, that all such logs can be sent to that particular channel.) This name is user-defined.
  • the resources/punch directory contains the punchlets. You can define punchlets for all tenants, as in this example, or only for a given tenant or channel. The PunchPlatform comes with a complete set of example parsers, organized in several subfolders.
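
The layout above can also be walked programmatically. The following Python sketch is not part of the PunchPlatform tooling: the function names and the `conf_root` argument are hypothetical, and the only assumption is the `tenants/<tenant>/channels/<channel>/channel_structure.json` layout shown above.

```python
from pathlib import Path


def channel_structure_path(conf_root, tenant, channel):
    """Return the expected location of a channel's structure file.

    Assumes the layout shown above:
    <conf_root>/tenants/<tenant>/channels/<channel>/channel_structure.json
    """
    return (Path(conf_root) / "tenants" / tenant / "channels"
            / channel / "channel_structure.json")


def list_channels(conf_root, tenant):
    """List the channels of a tenant, i.e. the sub-folders of 'channels'
    that contain a channel_structure.json file."""
    channels_dir = Path(conf_root) / "tenants" / tenant / "channels"
    if not channels_dir.is_dir():
        return []
    return sorted(
        d.name for d in channels_dir.iterdir()
        if (d / "channel_structure.json").is_file()
    )
```

With the sample layout, `list_channels(conf_root, "mytenant")` would return `["sourcefire"]`.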

In a given channel directory, you find a [channel_structure.json] file and one or several other files. The [channel_structure.json] file defines the structure of the channel, which can be composed of:

  • stream applications: for example Spark streaming or Storm applications.
  • batch applications: typically spark PML jobs.
  • various tasks: applications you execute periodically for various functional purposes.

Channel Structure

Layout

The channel_structure.json file has the following structure:

{
    // version is usually the one of the main punch release.
    // This is to guarantee backward compatibility. 
    version : "6.0"

    // A channel can include several processing applications.
    applications : [
        { application }
    ]
    // and potentially some shared resources such as kafka topics
    resources : [
        { resource part }
    ]
}

Applications can be of several types: storm, spark, kubernetes or shiva. Whatever their type, they all share the following three properties:

{
    // a unique name.
    name: myapplication

    // the name of the cluster in charge of running that job.
    // A cluster can be a Storm|Spark|Shiva or other cluster, 
    // as long as the corresponding cluster is defined in the 
    // punchplatform.properties configuration file.
    cluster: main

    // one of `none` or `kill_then_start`.
    // Using `none` makes this application ignore
    // channel reload orders. This property is optional and
    // equal to `kill_then_start` by default.
    reload_action: none

    // other properties
    ...
}

Note

The unique name of every application is <tenant_name>/<channel_name>/<cluster>/<application_name>.
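
As an illustration of this naming rule, here is a minimal Python sketch. The helper name is hypothetical, and rejecting empty components or components containing `/` is an assumption made here to keep the result unambiguous, not a documented PunchPlatform rule.

```python
def unique_application_name(tenant, channel, cluster, application):
    """Build <tenant_name>/<channel_name>/<cluster>/<application_name>."""
    parts = (tenant, channel, cluster, application)
    for part in parts:
        # Assumption: a component must be non-empty and must not contain
        # the '/' separator, otherwise the name could not be split back.
        if not part or "/" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return "/".join(parts)


print(unique_application_name("mytenant", "sourcefire", "main", "input"))
# prints: mytenant/sourcefire/main/input
```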

Storm Application

A storm application is defined using a punchline. Storm applications are ever-running streaming apps.

{
    // the name MUST refer to a local input.json or input.hjson file
    // where the storm punchline is described.
    name: input

    type : storm

    // the name of your target storm cluster
    cluster: main

    reload_action: kill_then_start
}
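
The constraint on the application name can be checked before starting a channel. The sketch below is a hypothetical helper, not a PunchPlatform command; it assumes the punchline file sits in the channel directory, next to channel_structure.json, and tries the .json extension first.

```python
from pathlib import Path


def find_punchline_file(channel_dir, application_name):
    """Return the punchline file a storm application name refers to.

    Looks for <name>.json, then <name>.hjson, in the channel directory,
    and returns None when neither exists.
    """
    for extension in (".json", ".hjson"):
        candidate = Path(channel_dir) / (application_name + extension)
        if candidate.is_file():
            return candidate
    return None
```

For the example above, `find_punchline_file(channel_dir, "input")` returns the path of input.json or input.hjson, or `None` if the punchline file is missing.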

Streaming Spark Applications

Streaming Spark applications are defined using a spark punchline. For batch processing use cases, i.e. punchlines that eventually end, submit them to shiva (see below).

Here is an example:

{
    type : spark

    // This must correspond to a local my_spark_app.hjson file
    name : my_spark_app

    // the name of a declared spark cluster
    cluster : main
}

Shiva Applications

Shiva apps can be either ever-running stream apps or batch apps that eventually terminate. You can request shiva to periodically relaunch your batch applications.

Here is an example to request shiva to start a logstash daemon.

{
    type : shiva

    // the short name of the task. The task unique name
    // will appear as <tenant>_<channel>_<name>.
    name : my_shiva_job

    // the command must be an executable.
    command : logstash

    // your command arguments 
    args : [ "-f" , "logstash.yaml" ]

    // the associated resources, if any. Here you must provide
    // the logstash.yaml file. The accepted resources are files or
    // folders.
    resources : [
        "logstash.yaml"
    ]

    // the target shiva cluster. This key must be associated to a
    // shiva cluster as defined in your punchplatform.properties file.
    cluster : common

    // the tags to place your task to the shiva node you want.
    shiva_runner_tags : [ "standalone" ]

    // an optional cron expression should you require periodic
    // scheduling of your task. Here is an example to execute
    // it every 30 seconds
    // quartzcron_schedule : 0/30 * * * * ? *
}

Shiva is a lightweight distributed runtime engine. It takes care of running your application on one or several nodes in a robust and resilient way. Refer to the Shiva chapter.

Using the punch and shiva, you can execute two kinds of applications:

  1. the ones provided and fully integrated by the punchplatform. These are described next.
  2. your own. Simply provide an executable command. Shiva takes care of executing it on the target servers. It is, however, your responsibility to equip the target server(s) with the environment necessary to run your task.

Shiva Built-In Applications

Logstash

Logstash is fully integrated, as long as you selected the corresponding shiva deployment option.

Elastalert

You can run Elastalert rules using Shiva.

The corresponding command is elastalert. You must provide an Elastalert configuration file, and either a rules folder or a single rule using the --rule Elastalert option:

{
    type: shiva
    name: myjob
    command: elastalert
    args: [
        "--config", "myconfig.yaml",
        "--verbose"
    ]
    resources: [
        "rules/",
        "myconfig.yaml"
    ]
    cluster: common
    shiva_runner_tags: [ standalone ]
}

For the available options, take a look at the Elastalert documentation.

Spark Punchlines

You can run spark punchlines directly from Shiva. This is the simplest way to execute a spark job. Note that using shiva requires you to select the spark client deploy mode.

{
    type: shiva
    name: myjob
    command: punchline
    args: [
        "--punchline", "myapp.hjson",
        "--deploy-mode", "foreground",
        "--runtime", "spark"
    ]
    resources: [
        myapp.hjson
    ]
    cluster: common
    shiva_runner_tags: [ standalone ]
}

Note

The --runtime value can be either spark or pyspark.

Plans

Plans are used to periodically run spark punchlines with advanced templating capabilities. A plan is typically used to run a spark application at specific time intervals, each run consuming a specific range of timed data.

The corresponding command is plan. You must provide both an application template and a plan configuration file, so the example below lists both as resources:

{
    type: shiva
    name: my_plan
    command: plan
    args: [
        "--plan", "plan.hjson",
        "--template", "myapp.template",
        "--deploy-mode", "foreground"
    ]
    resources: [
        plan.hjson
        myapp.template
    ]
    cluster: common
    shiva_runner_tags: [ standalone ]
}

Streaming Punchlines

In addition to running streaming punchlines in a Storm cluster, the punch also supports a lightweight, single-process, storm-compatible engine. It can be used on small configurations to run punchlines without the cost of operating a storm cluster.

The corresponding shiva command is punchline.

{
    type: shiva
    name: my_streaming_punchline
    command: punchline
    args: [ "--mode", "light", "--punchline", "punchline.hjson" ]
    resources: [
        punchline.hjson
    ]
    cluster: common
    shiva_runner_tags: [ standalone ]
}

Warning

All resources are stored in zookeeper. For example, if you want to launch a punchline that uses punchlets, you must list your resources directory (here the punchlets directory) in your job resources. If you update a punchlet, you must save it to zookeeper using the punchplatform-putconf.sh shell script.

Java Apps

Shiva can also execute an external jar application.

{
    type: shiva
    name: myjar
    command: java
    args: [ "-jar", "myjar.jar" ]
    resources: [
        myjar.jar
    ]
    cluster: common
    shiva_runner_tags: [ "standalone" ]
}

Resources

The resources part of the channel_structure.json file lets you define global resources potentially shared by your channel jobs. As of today only kafka topic resources are supported.

resources : [
    {
        type: kafka_topic

        name : mytenant_arkoon_output

        // the logical Kafka cluster name. A corresponding entry must appear
        // in your punchplatform.properties file.
        cluster : local

        // the number of partitions for this topic. The more partitions,
        // the more scalable your channel.
        partitions : 4

        // the number of replicas for each partition. 2 is a minimum to achieve
        // high availability; the value 1 used here provides none.
        replication_factor : 1
    }
]
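
The high-availability remark on replication_factor lends itself to a simple check. The following Python sketch is a hypothetical helper, not a PunchPlatform feature. It assumes the resources list has already been parsed into Python dictionaries, and it skips resources that do not set replication_factor, since the default value is not stated here.

```python
def kafka_topic_warnings(resources):
    """Return warnings for kafka_topic resources that cannot be
    highly available, i.e. whose replication_factor is below 2."""
    warnings = []
    for resource in resources:
        if resource.get("type") != "kafka_topic":
            continue
        name = resource.get("name", "<unnamed>")
        replication = resource.get("replication_factor")
        # Resources without an explicit value are skipped (assumption).
        if replication is not None and replication < 2:
            warnings.append(
                f"{name}: replication_factor={replication} "
                "does not provide high availability"
            )
    return warnings


resources = [{
    "type": "kafka_topic",
    "name": "mytenant_arkoon_output",
    "cluster": "local",
    "partitions": 4,
    "replication_factor": 1,
}]
print(kafka_topic_warnings(resources))
```

Run on the example above, the check reports the mytenant_arkoon_output topic, which is acceptable on a standalone setup but not in production.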

Advanced Options

Channel structure

The channel structure file accepts the following additional options.

{
    // true by default. If set to false, the channel only starts
    // if you use a specific per-channel start command. It does not start
    // when you start the whole tenant.
    start_by_tenant : true

    // true by default. If set to false, the channel does not stop if you issue
    // a tenant-level stop command. Stopping it requires a dedicated per-channel command.
    // This prevents unwanted stops of important administrative or critical channels.
    stop_by_tenant : true
}
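
The effect of these two options on tenant-level commands can be summarized by the Python sketch below. It is only an illustration of the rules stated in the comments above: the function name is hypothetical, and the channel structures are assumed to be already parsed into dictionaries.

```python
def channels_for_tenant_command(channels, command):
    """Select the channels affected by a tenant-level 'start' or 'stop'.

    'channels' maps a channel name to its parsed channel structure.
    Both options default to true, as stated above.
    """
    option = {"start": "start_by_tenant", "stop": "stop_by_tenant"}[command]
    return sorted(
        name for name, structure in channels.items()
        if structure.get(option, True)
    )


channels = {
    "sourcefire": {},                         # defaults: follows the tenant
    "admin": {"stop_by_tenant": False},       # must be stopped explicitly
    "on_demand": {"start_by_tenant": False},  # must be started explicitly
}
print(channels_for_tenant_command(channels, "start"))
# prints: ['admin', 'sourcefire']
print(channels_for_tenant_command(channels, "stop"))
# prints: ['on_demand', 'sourcefire']
```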

JVM Options

You can override the JVM options provided by default by the PunchPlatform using the args section:

{
    type: shiva
    name: my_plan
    command: plan
    args: [
        "--plan", "plan.hjson",
        "--template", "myapp.template",
        "--deploy-mode", "foreground",
        "--childopts", "-Xms256m -Xmx512m"
    ]
    resources: [
        plan.hjson
        myapp.template
    ]
    cluster: common
    shiva_runner_tags: [ standalone ]
}

Note

This option is only available for the following commands: punchline, plan, platform-monitoring, channels-monitoring and archives-housekeeping.
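
If you generate channel structures from scripts, the restriction above can be enforced when the --childopts argument is added. The Python sketch below is a hypothetical helper under that assumption; the list of commands is taken from the note above, and the application is assumed to be a dictionary mirroring the shiva application layout.

```python
# Commands that accept --childopts, according to the note above.
CHILDOPTS_COMMANDS = {
    "punchline", "plan", "platform-monitoring",
    "channels-monitoring", "archives-housekeeping",
}


def with_jvm_options(application, childopts):
    """Return a copy of a shiva application whose args end with
    '--childopts <childopts>'. The input application is left untouched."""
    command = application.get("command")
    if command not in CHILDOPTS_COMMANDS:
        raise ValueError(
            f"--childopts is not supported by command {command!r}"
        )
    updated = dict(application)
    updated["args"] = list(application.get("args", [])) + [
        "--childopts", childopts,
    ]
    return updated
```

For instance, applying `with_jvm_options(app, "-Xms256m -Xmx512m")` to a plan application yields the args shown in the example above, while applying it to a logstash application raises a `ValueError`.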