
Administration Tools

Abstract

The punch tooling provides various means to check the resources and processes at play. This chapter provides a quick tour of the commands.

The PunchPlatform tooling is composed of :

  • standard tools coming from off-the-shelf software components (Zookeeper, Storm, Kafka, Ceph).
  • additional command-line tools provided with the PunchPlatform environment, which ease the usage of the main components and provide higher-level commands (a quick first contact is sketched after this list) :
    • punchctl.sh for configuring/managing chains of user-defined data processing (see channel)
    • punchplatform-kafka-topics.sh for managing kafka topics (queues) in the PunchPlatform
    • punchplatform-objects-storage.sh for listing/managing archived data (Ceph)
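
As an illustration, each of these tools prints its builtin help when invoked with the usual --help convention (a sketch only ; exact sub-commands and options depend on your PunchPlatform release, check each tool's man page) :

punchctl.sh --help
punchplatform-kafka-topics.sh --help
punchplatform-objects-storage.sh --help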

Shiva Task Manager

Shiva is a distributed and resilient job manager in charge of executing various kinds of jobs. It is used to run Logstash processes, Spark jobs and plans, or embedded Storm topologies.

Refer to the Shiva chapter for details. Note that Shiva does not provide a command-line tool. It only starts tasks defined as part of a channel.

Kafka

kafka-topics.sh :

This is the standard tool provided by Kafka to manage topics. It is installed in the Kafka setup 'bin' directory.

punchplatform-kafka-topics.sh :

This is a wrapper for kafka-topics.sh (and for some other Kafka functions). It can be run from a LMC administration command-line environment. See its man page for details on commands. Samples of useful commands are :

  • listing the topics of a cluster (front, in a typical deployment) :

    punchplatform-kafka-topics.sh --kafkaCluster front --list
    
  • listing the partition status of all topics of a cluster (front or back, in a typical deployment) :

    punchplatform-kafka-topics.sh --kafkaCluster front --describe
    
  • listing the partition status of a given topic of a cluster (front or back, in a typical deployment) :

    This is useful to check whether a broker is down or a partition is not replicated :

    punchplatform-kafka-topics.sh --kafkaCluster front --describe --topic rose.bluecoat_sg_510_dmz
    

    In this command output, the Isr column (In Sync Replica) lists the up-to-date replicas that exist in the cluster. Numbers present in the Replicas column but missing from the Isr column indicate out-of-sync replicas :

    kafka topic for kafka cluster 'front'...
    Topic:rose.bluecoat_sg_510_dmz  PartitionCount:2    ReplicationFactor:1 Configs:
        Topic: rose.bluecoat_sg_510_dmz Partition: 0    Leader: 2   Replicas: 2 Isr: 2
        Topic: rose.bluecoat_sg_510_dmz Partition: 1    Leader: 1   Replicas: 1 Isr: 1
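
    To list only the partitions whose replicas are not all in sync, the vanilla kafka-topics.sh tool also accepts an --under-replicated-partitions flag together with --describe (whether the punchplatform wrapper forwards this flag depends on your release ; <zk_connect> is a placeholder for your Zookeeper connect string) :

    /data/opt/kafka/bin/kafka-topics.sh --zookeeper <zk_connect> --describe --under-replicated-partitions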
    

kafka partition reassignment tool :

Refer to HOWTO alter existing kafka topics. This standard Kafka tool requires a Java environment (manually set PATH to point to /data/opt/java.../bin, or type 'source punchplatform-env.sh' in a PunchPlatform command-line environment to use the PunchPlatform environment variables and path).

source punchplatform-env.sh
cat partitions-to-move.json
{"partitions":
     [{"topic": "foo",
       "partition": 1,
       "replicas": [1,2,4] }],
 "version":1
}
/data/opt/kafka/bin/kafka-reassign-partitions.sh --manual-assignment-json-file partitions-to-move.json --execute
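
Note that recent Kafka releases renamed this option to --reassignment-json-file and provide a --verify mode to check the progress of the move. A typical follow-up with the same JSON file would be (depending on the Kafka version, a --zookeeper or --bootstrap-server option is also required ; <zk_connect> is a placeholder) :

/data/opt/kafka/bin/kafka-reassign-partitions.sh --zookeeper <zk_connect> --reassignment-json-file partitions-to-move.json --verify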

Elasticsearch API requests

Listing indexes and their individual status :

curl elasticsearch_data_node:9200/_cat/indices?v
# or
curl elasticsearch_client_node:9100/admin/_cat/indices?v
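
To focus on problematic indexes only, recent Elasticsearch versions accept a health filter on the same endpoint :

# list only indexes in 'red' health (i.e. with unassigned primary shards)
curl "elasticsearch_data_node:9200/_cat/indices?v&health=red"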

Listing nodes registered in the Elasticsearch cluster, and their resource levels (RAM, load) :

curl elasticsearch_data_node:9200/_cat/nodes?v
# or
curl elasticsearch_client_node:9100/admin/_cat/nodes?v
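
A complementary check is the standard cluster health API, which reports the overall status (green/yellow/red) together with the number of nodes currently seen by the cluster :

curl elasticsearch_data_node:9200/_cluster/health?pretty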

All listed nodes are normally working Elasticsearch nodes. Missing nodes may be either not running, or running in another cluster ; to assess the latter, you can issue a curl directly to the node on its production interface (esnode.prod:9200/_cat/nodes?v, see also the sketch after this list). This is a split-brain case, which must be resolved by :

  • checking that the network is ok between the clusters (i.e. curl on the production IP from one to the other)
  • restarting the smaller of the two clusters, and checking that its nodes insert themselves into the larger cluster
  • otherwise, stopping the smaller cluster (to avoid data being split between the clusters, depending on the load balancer policy)
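
A minimal sketch for comparing cluster memberships from the command line, assuming hypothetical production hostnames esnode1.prod and esnode2.prod (replace with your own) ; if the two lists differ, the cluster is split :

# compare the member list seen by each node
for node in esnode1.prod esnode2.prod; do
    echo "== members seen by $node =="
    curl -s "$node:9200/_cat/nodes?h=name" | sort
done
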
  • Closing an index (to free memory) :

    # YOU MUST USE THE ELASTICSEARCH CLIENT
    curl -XPOST elasticsearch_client_node:9100/admin/<indexName>/_close
    

    Notice that depending on its automatic actions settings, the housekeeping feature of the PunchPlatform admin service may automatically reopen manually closed indexes. To prevent this behavior, either forbid reopening in the admin service section of the configuration (and restart the service), or activate the manual override feature in that same section (and restart the service) and then manually force the state of the index in the admin service graphical user interface.
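
    Conversely, a closed index can be reopened through the standard _open endpoint (again through the Elasticsearch client node) :

    curl -XPOST elasticsearch_client_node:9100/admin/<indexName>/_open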

  • Listing recovery actions (e.g. rebuilding of replicas after a node failure) :

curl elasticsearch_data_node:9200/_cat/recovery?v
# or
curl elasticsearch_client_node:9100/admin/_cat/recovery?v

Full documentation is available in the Elasticsearch 'cat' web API reference.

  • Check the content of an index through the Elasticsearch API :

curl elasticsearch_data_node:9200/<index_name>/_search?pretty
# or
curl elasticsearch_client_node:9100/admin/<index_name>/_search?pretty
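
When the index is large, the standard query-string syntax helps narrow the check to a few matching documents (type:apache is a hypothetical Lucene query string ; adapt the field and value to your data) :

# fetch at most 2 documents matching the query string
curl "elasticsearch_data_node:9200/<index_name>/_search?q=type:apache&size=2&pretty"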

Storm User Interface

Using a browser, you can connect to the Storm UI GUI (usually on port 8080 on the same host as the nimbus/master of the cluster).

  • Supervisors are the servers which are UP and running, and have declared themselves in the zookeeper folder of this storm cluster.

  • Slots are the allowed number of Java virtual machines (workers) that can be run in the cluster (sum of the slots allocated on each supervisor).
  • By clicking on a topology name, one can

    • kill the topology (if for some reason it cannot be stopped using the PunchPlatform command-line tool or admin GUI)
    • DO NOT USE features which are not yet supported in PunchPlatform
    • view the uptime of the topology
    • view at a glance the number of failures on data handling, declared by the topology components (i.e. problems which caused a non-ack/retry). If a component has failures, it is a good idea to have a look at the associated logs
  • By clicking on a component (Bolt/Spout) in the topology, one can

    • see the host and port associated to each component instance. This is useful to find the associated logs, which will be on this host, in /data/logs/storm/worker-<port>.log (see the log inspection sketch after this list)
    • see the latency generated by this component. If the number is high, some tuning and/or problem analysis is in order on this component (maybe activate some more traces and capture logs for back-office analysis?)
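
As an illustration, once the host and port of a suspicious component instance are known, the corresponding worker log can be inspected directly on that host (6700 is a hypothetical worker port ; use the port displayed in the Storm UI) :

# on the storm supervisor host running the component
grep -i error /data/logs/storm/worker-6700.log | tail -20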

Zookeeper tools

zookeeper-console.sh

This is the standard tool provided by Zookeeper to connect to, browse and change data inside a Zookeeper cluster. It is installed in the Zookeeper setup 'bin' directory. It requires a Java environment (manually set PATH to point to /data/opt/java.../bin, or type 'source punchplatform-env.sh' in a PunchPlatform command-line environment to use the PunchPlatform environment variables and path).

punchplatform-zookeeper-console.sh :

This is a wrapper for zookeeper-console.sh. It can be run from a LMC administration command-line environment. See its man page or builtin help for details on commands.

Invoking this command without arguments will connect you to the default Zookeeper cluster (unless it is unreachable, or down because more than half of its nodes are down).

Samples of useful commands are :

  • ls <path> to list a path (starting at /)
  • get <path> to view the data of a node (e.g. : get /punchplatformprod/kafka-back/brokers/ids/0)
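
A minimal interactive session, reusing the sample path above (the /punchplatformprod root is deployment-specific) :

punchplatform-zookeeper-console.sh
ls /
ls /punchplatformprod
get /punchplatformprod/kafka-back/brokers/ids/0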

Archiving Tools and Ceph tools

Please refer to the object storage operator tips chapter.
The latest Internet documentation for the Ceph open-source product is available at http://docs.ceph.com/docs/master/#

An offline copy (at the time of release of this PunchPlatform software version) of this documentation is available here.
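
For a quick cluster-level health check from a node where the standard Ceph client is installed and configured, the vanilla Ceph CLI can be used (these are standard Ceph commands, not PunchPlatform-specific tools) :

ceph status   # overall cluster health, monitors and OSD count
ceph df       # global and per-pool storage usage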