
HOWTO expand a kafka cluster

Why do that

On a shared or dedicated platform, you may want to increase the platform's capacity. For a Kafka cluster, the goal can be to:

  • increase the retention time
  • increase the EPS (events per second) throughput

Prerequisites

  • Access to the deployment device (Ansible VM)
  • New Kafka devices ready for production (repository, updates, DNS, LDAP, etc.)
  • If you are in production, an RFC must have been completed

What to do

  1. Check the current deployment from the deployment device.
punchplatform-deployer.sh --deploy -Kk --check --diff -t kafka -l kafka_servers

Note

You can also limit the deployment to a specific kafka_cluster by using kafka- instead of kafka_servers

  1. Tag the configuration before the extension.
cd $PUNCHPLATFORM_CONF_DIR
git tag -a vX.Y -m "X.Y before extension"
  1. Update the PunchPlatform properties with the new Kafka devices (brokers).
"kafka" : {
    "clusters" : {
        "local" : {
            "brokers" : ["node01:9092", "node02:9092", "node03:9092"],
            ...
        }
    }
}

Warning

Do not forget to include the TCP port of each broker in the list.
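
As a quick guard against a missing port, the broker entries can be checked before deploying. This is a minimal sketch, not part of the PunchPlatform tooling: the function name `broker_has_port` is hypothetical, and the broker list is copied by hand from the properties above (with `node03` deliberately left without a port to show the failure case).

```shell
# Hypothetical helper: succeed only if the entry looks like "host:port"
# with a numeric port between 1 and 65535.
broker_has_port() {
  case "$1" in
    *:*) ;;                # contains a colon, keep checking
    *) return 1 ;;         # no colon at all: the port is missing
  esac
  port=${1##*:}            # text after the last colon
  case "$port" in
    ''|*[!0-9]*|??????*) return 1 ;;   # empty, non-numeric, or too long
  esac
  [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}

# Entries copied by hand from the "brokers" list; node03 has no port on purpose.
for broker in node01:9092 node02:9092 node03; do
  if broker_has_port "$broker"; then
    echo "ok: $broker"
  else
    echo "missing or invalid port: $broker"
  fi
done
```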

  1. Deploy Metricbeat on all new devices first.
punchplatform-deployer.sh --deploy -Kk -t metricbeat -l kafka_servers

Check the system metrics of the new nodes in the Grafana System Dashboard.

  1. Deploy Kafka on only one new node.
punchplatform-deployer.sh --deploy -Kk -t kafka -l <hostname>
  1. Check that the ID of the new broker is registered in ZooKeeper, using the ZooKeeper debug tool.
punchplatform-zookeeper-console.sh
ls /<platform_id>/kafka-<kafka_cluster>/brokers/ids

The ID of the new node must appear in the result. If it does not, please contact the PunchPlatform Support.
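
The check above can also be scripted once the listing has been copied from the console. This is a minimal sketch under an assumption: the console prints children as a bracketed, comma-separated list such as `[1, 2, 3]`, as the standard ZooKeeper CLI does. The function name `broker_id_listed` and the sample IDs are hypothetical.

```shell
# Hypothetical helper: succeed if broker ID $2 appears in a listing
# such as "[1, 2, 3]" (assumed ZooKeeper CLI output format).
broker_id_listed() {
  printf '%s\n' "$1" \
    | sed 's/[][ ]//g' \
    | tr ',' '\n' \
    | grep -qxF -- "$2"    # exact, whole-line match so "3" does not match "13"
}

# Sample listing pasted from the console; the IDs are illustrative only.
listing="[1, 2, 3, 4]"
if broker_id_listed "$listing" 4; then
  echo "broker 4 is registered"
else
  echo "broker 4 is missing"
fi
```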

  1. Deploy Kafka on all nodes.
punchplatform-deployer.sh --deploy -Kk -t kafka -l kafka_servers

After this step, the Kafka cluster has been restarted, so check the current processing of your platform, especially the writes to this Kafka cluster.

  1. Deploy again to check the current deployment.
punchplatform-deployer.sh --deploy -Kk -t kafka -l kafka_servers
  1. Perform the final checks.

  2. Check the current processing of your platform. No backlog must be observed.

  3. Check the kafka tab in the PunchPlatform Admin Server (port 5000)

  4. End of extension.

Tag the configuration after the operation.

cd $PUNCHPLATFORM_CONF_DIR
git tag -a vX.Y -m "X.Y after extension"
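
For the "no backlog" check in the final steps, one possible approach is to sum the consumer lag reported by the standard Kafka tool `kafka-consumer-groups.sh --describe`. That tool is not mentioned in this procedure, and whether your platform's consumers commit their offsets to Kafka is an assumption to verify first. The function name `total_lag` is hypothetical. It locates the `LAG` column from the header row, so it does not depend on the exact column order.

```shell
# Hypothetical helper: read "kafka-consumer-groups.sh --describe" output
# on stdin and print the sum of the LAG column.
total_lag() {
  awk '
    # Until the header row is found, look for the LAG column and skip the line.
    lag_col == 0 {
      for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i
      next
    }
    # Sum numeric lag values; "-" (no committed offset) is ignored.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { print sum + 0 }
  '
}

# Assumed usage (the group name is a placeholder):
#   kafka-consumer-groups.sh --bootstrap-server node01:9092 \
#     --describe --group <group> | total_lag
```

A result of 0 means no partition of that group reports lag at the time of the check. Repeat the check a few times, since lag fluctuates under normal load.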