Services

At first glance, a service looks like a channel. Whereas a channel groups several functional units (e.g. Storm topologies, Spark jobs), a service groups several administrative or monitoring tasks.

Note

For clarity, we refer to these as tasks, in contrast to the jobs, topologies or pipelines that carry business logic processing.

The punchplatform ships with three ready-to-use services:

  • Kafka topic monitoring task: monitors the lag of all Kafka topics, per consumer group
  • Elasticsearch housekeeping task: cleans up old Elasticsearch data
  • archive housekeeping task: cleans up expired archive data

You can easily extend these with your own tasks, as we will see shortly.

Tasks Scheduling

Administrative tasks are critical and must be executed even if a cluster node goes down. Where they run does not matter, as long as they are executed regularly. This is illustrated next:

image

This is where one of the punchplatform components, Shiva, comes into play. Shiva is a robust, lightweight and resilient task scheduler. Refer to the Shiva chapter for an overview.

That said, a service is really just like a channel. A service_structure.json file lists the various tasks to schedule. When you start the service, these tasks are scheduled.

Here is an example service_structure.json file:

{
    "shiva_tasks" : [  
        { 
            "name" : "archives_housekeeping_service", 
            "cmd_file" : "archives_housekeeping_service.sh",
            "cluster" : "common",
            "shiva_runner_tags" : ["standalone"],
            "quartzcron_schedule" : "0 0 * ? * * *"
        }, 
        {
            "name" : "elasticsearch_housekeeping_service",
            "cmd_file" : "elasticsearch_housekeeping_service.sh",
            "cluster" : "common",
            "shiva_runner_tags" : ["standalone"],
            "quartzcron_schedule" : "0 0 * ? * * *"
        }
    ]
}
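
To make the structure of this file more concrete, here is a minimal Python sketch that reads a service_structure.json document and summarizes each task. It also splits the quartzcron_schedule value into the standard Quartz cron field names (seconds, minutes, hours, day of month, month, day of week, optional year). The helper names `describe_schedule` and `summarize_tasks` are illustrative and not part of the punchplatform. The JSON keys are the ones shown in the example above.

```python
import json

# Standard Quartz cron field order; the seventh field (year) is optional.
QUARTZ_FIELDS = ["seconds", "minutes", "hours", "day_of_month",
                 "month", "day_of_week", "year"]


def describe_schedule(expression):
    """Map each part of a Quartz cron expression to its field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(
            f"expected 6 or 7 fields, got {len(parts)}: {expression!r}")
    return dict(zip(QUARTZ_FIELDS, parts))


def summarize_tasks(structure):
    """Return (name, cmd_file, schedule fields) for each shiva_tasks entry."""
    summary = []
    for task in structure["shiva_tasks"]:
        summary.append((task["name"],
                        task["cmd_file"],
                        describe_schedule(task["quartzcron_schedule"])))
    return summary


# Inline copy of one task from the example above, used for illustration only.
EXAMPLE = json.loads("""
{
    "shiva_tasks": [
        {
            "name": "archives_housekeeping_service",
            "cmd_file": "archives_housekeeping_service.sh",
            "cluster": "common",
            "shiva_runner_tags": ["standalone"],
            "quartzcron_schedule": "0 0 * ? * * *"
        }
    ]
}
""")

if __name__ == "__main__":
    for name, cmd_file, fields in summarize_tasks(EXAMPLE):
        print(name, cmd_file, fields)
```

Read this way, the expression `0 0 * ? * * *` has seconds and minutes set to 0 and hours set to `*`, so under Quartz semantics it fires at the start of every hour.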

Just like channels, services are defined in a directory with the following layout:

tenants/mytenant
└── services
    └── admin
        ├── service_structure.json
        ├── archives_housekeeping_service.sh
        └── elasticsearch_housekeeping_service.sh
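
As a sanity check on such a layout, the following Python sketch lists the cmd_file entries of a service_structure.json that have no matching script in the service directory. It assumes that cmd_file is resolved relative to the service directory, which the layout suggests but this page does not state. The function names `missing_cmd_files` and `demo` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def missing_cmd_files(service_dir):
    """List cmd_file entries that have no matching file in service_dir.

    Assumption: cmd_file is resolved relative to the service directory.
    """
    service_dir = Path(service_dir)
    structure = json.loads(
        (service_dir / "service_structure.json").read_text())
    return [task["cmd_file"]
            for task in structure["shiva_tasks"]
            if not (service_dir / task["cmd_file"]).is_file()]


def demo():
    """Build a throwaway tenants/mytenant/services/admin tree and check it."""
    with tempfile.TemporaryDirectory() as root:
        admin = Path(root) / "tenants" / "mytenant" / "services" / "admin"
        admin.mkdir(parents=True)
        tasks = [
            {"name": "archives_housekeeping_service",
             "cmd_file": "archives_housekeeping_service.sh"},
            {"name": "elasticsearch_housekeeping_service",
             "cmd_file": "elasticsearch_housekeeping_service.sh"},
        ]
        (admin / "service_structure.json").write_text(
            json.dumps({"shiva_tasks": tasks}))
        # Create only one of the two scripts, so the other one is reported.
        (admin / "archives_housekeeping_service.sh").write_text("#!/bin/sh\n")
        return missing_cmd_files(admin)


if __name__ == "__main__":
    print(demo())
```

Running such a check before starting a service catches a renamed or forgotten script early, rather than at the first scheduled execution.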

And just as you have a punchplatform-channel.sh command to start or stop a channel, you have a punchplatform-service.sh command to start and stop a service:

$ punchplatform-service.sh --start <tenant>/<service_name>
$ punchplatform-service.sh --stop <tenant>/<service_name>
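
If you drive these commands from a script, a small wrapper can build the argument list for you. The Python sketch below uses only the `--start` and `--stop` options shown above. The function names are hypothetical, and it assumes that the service name is the directory name under services (for example `admin` in the layout shown earlier), which this page does not state explicitly.

```python
import subprocess


def service_command(action, tenant, service_name):
    """Build the punchplatform-service.sh argument list for start or stop."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action!r}")
    return ["punchplatform-service.sh",
            f"--{action}",
            f"{tenant}/{service_name}"]


def run_service_command(action, tenant, service_name):
    """Run the command; requires punchplatform-service.sh on the PATH."""
    return subprocess.run(service_command(action, tenant, service_name),
                          check=True)


if __name__ == "__main__":
    # Only print the command here, so the sketch runs without a platform.
    print(" ".join(service_command("start", "mytenant", "admin")))
```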