
Overview

Abstract

The punchplatform provides its users with a small set of concepts to understand and a minimal set of commands to learn. This chapter provides a quick overview of everything you need to understand. Dedicated chapters provide more detailed information on each topic.

Configuration Management

Users interact with the platform by defining and updating configuration files, and by issuing commands. They can do this in one of two modes.

Terminal Command Line

The punch provides a Unix terminal environment to let users operate their platform. This is depicted next, where the yellow server is the one where the operator environment has been set up.

[image: command-line operator environment]

The punchctl command makes it easy to start and stop channels. It provides an interactive mode with advanced completion capabilities, as well as non-interactive commands. This is described in the commands chapter.
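For instance, a typical operator session could look like the following sketch. The channel name and the exact sub-command syntax shown here are illustrative assumptions; refer to the commands chapter for the authoritative reference.

# Non-interactive usage: the 'apache' channel name and the
# status/start/stop sub-commands are illustrative assumptions.
punchctl --tenant mytenant status
punchctl --tenant mytenant start --channel apache
punchctl --tenant mytenant stop --channel apache

# Interactive usage: launch punchctl without a sub-command to get
# a shell with completion, then type the same sub-commands.
punchctl --tenant mytenant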

On that server, the configuration just described is stored in a regular Unix folder.

Graphical User Interface

The punch also provides a graphical user interface implemented as a Kibana plugin. It is depicted next, where the gray server has been set up with that plugin.

[image: Kibana plugin graphical user interface]

This plugin is mostly used on development and standalone platforms. It provides advanced development functions such as the machine learning studio and the online punch testers.

Production Features

Backup Restore

Production systems require additional configuration management features. One is to save your configurations. The punch provides commands to save the platform and tenant configuration to the internal Zookeeper database. It is as simple as executing the following command to save (say) all of a tenant's configuration (here 'mytenant'):

punchctl --tenant mytenant configuration --push

To restore the configuration, simply execute:

punchctl --tenant mytenant configuration --pull

Change History

Keeping track of all the configuration changes is yet another key capability. This is provided by git. It is strongly advised to set up a git infrastructure to allow users to synchronise, share and potentially roll back their configuration changes.

Refer to the separate git setup guide for setup instructions.
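As a minimal sketch, assuming the operator configuration folder is a plain directory and that a remote git repository is already available (the path and remote URL below are illustrative), versioning the configuration could look like:

# Illustrative only: the configuration path and remote URL are assumptions.
cd /home/operator/punchplatform-conf
git init
git remote add origin git@gitserver:punch/mytenant-conf.git
git add .
git commit -m "initial import of the tenant configuration"
git push -u origin master

# Later on, after editing channel configurations:
git add -A
git commit -m "tune the apache channel"
git push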

Operating your Platform

This short chapter provides the essential information to run and operate your channels.

Using the start command described here, the punch automatically submits your jobs to the target (Storm|Spark|Shiva) clusters and runs them.

The status of your channels, tenants and platform is provided to you through the command line or REST APIs. Let us go through the various ways you can get a precise view of the runtime status of your platform.

Channels Status

The runtime status of channels is one of the following.

State        Meaning
ACTIVE       The channel is running in nominal condition.
STOPPED      The channel is stopped.
PARTIAL      A non-nominal situation: one of the channel's or service's sub-tasks is no longer running, most probably because an operator stopped it using an external or CLI command.
OUT_OF_SYNC  A non-nominal situation: the configuration of the channel was updated without restarting it, i.e. what is running is not what you expect.

Tenant Status

If all channels of a tenant are active, the tenant itself is considered to be running in nominal condition.

Platform Status

The platform status provides you with a quick view of the runtime status of all your platform components: Elasticsearch, Spark, Storm, Zookeeper, etc.

REST APIs

The punch stores the various channel, tenant and platform metrics and statuses in Elasticsearch. In turn, Elasticsearch provides a REST API that external monitoring tools can use to easily check the overall status of the platform.

In addition, it lets you design Kibana dashboards to visualize everything you need to quickly understand your platform behavior, and pinpoint issues should you suffer failures.

Administrative Logging

All platform actions are recorded in a daily Elasticsearch index platform-logs-*. In particular, operator actions are recorded. Here is the log record format:

{
    # The content section contains the actual log. It can be:
    # - a log from punchctl to trace the channel and job start/stop commands
    # - a log from the Shiva leader or workers to trace the shiva job start/stop and
    #     assignment events
    # - logs from Shiva jobs (i.e. subprocess)
    "content": {
        # level can be INFO, WARN or ERROR.
        # ERRORs should be carefully watched for. 
        "level": "INFO",

        # the actual log message. 
        "message": "job started"

        # only for shiva jobs, the actual args command
        "args": "[/opt/data/punchplatform-standalone-5.2.0/external/punchplatform-shiva-5.2.0/features/commands/logstash, -f, logstash.conf]",

        # whenever available the logger that generated the log. This
        # can be the child logger or the shiva worker logger. 
        "logger": "org.thales.punch.apps.shiva.worker.impl.WorkerTask",
    },
    # the target section relates to the target runtime environment.
    # I.e. a Storm|Shiva|Spark cluster in charge of the related event.
    "target": {
      "cluster": "main",
      "type": "storm"
    },
    # the init section relates to the event originator. 
    "init": {
      "process": {
          # the application name that generated the event
          "name": "punchctl",
          # a runtime id whenever available
          "id": "56605@server1.thales.com"
      },
      "host": {
        # the host name where the event was issued
        "name": "server1"
      },
      "user": {
          # the (unix) user name owning the application.
          # It can be a platform operator or a daemon user depending
          # on the application generating the event.
         "name": "operator"
      }
    },
    # the tenant
    "tenant": "mytenant",

    # the type is always 'punch'.
    "type": "punch",

    # the vendor is always 'thales'
    "vendor": "thales",

    # the channel name
    "channel": "logstash",

    # the job name, if available
    "job": "logstash",

    # the generation timestamp
    "@timestamp": "2019-05-16T03:54:36.949Z"

The important normalized message values are the following:

Message                        Description
"job started"                  Indicates the start of a channel job. These messages are generated by the punchctl and shiva_worker applications.
"job stopped"                  Indicates the stop of a channel job. These messages are generated by the punchctl and shiva_worker applications.
"assigned job to worker"       Generated by the shiva_leader application. Indicates the assignment of a job to a shiva worker.
"scheduled ever running job"   Generated by a shiva_worker application. Indicates the scheduling of a continuously running shiva job.
"scheduled periodic job"       Generated by a shiva_worker application. Indicates the scheduling of a periodic shiva job, typically a PML plan or any other job scheduled using a Quartz cron schedule parameter.
"starting job on worker"       Generated by a shiva_worker application. Indicates the effective start of a shiva job.