HOWTO configure Kafka retention

Why do that

Deploying a Kafka cluster in a PunchPlatform involves several steps.

The first step is to provide sufficient storage.
The second step is to configure the main Kafka retention before knowing exactly how many topics there will be (this depends on the technologies collected) and their volume in EPS (events per second).
The third step comes just before going to production: the integrator MUST adapt the Kafka retention so that it is sized for each topic.

Set the main retention settings

These settings are applied when the PunchPlatform stack is deployed through the punchplatform-cluster tool. They can be updated in the punchplatform.properties file.

"kafka" : {
    "clusters" : {
        "local" : {
            "partition_retention_bytes" : 1073741824,
            "partition_retention_hours" : 24 
        }
    }
}

Then, update the platform through the punchplatform-cluster tool.

punchplatform-deployer.sh --deploy -Kk -t kafka
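The value 1073741824 used for partition_retention_bytes above is exactly 1 GiB (1024 x 1024 x 1024 bytes). When choosing another value, a small conversion helper avoids miscounting digits. The sketch below is only an illustration: `gib_to_bytes` is a hypothetical helper, not a PunchPlatform command.

```shell
# Hypothetical helper (not part of PunchPlatform): convert a size in GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 1     # prints 1073741824, the value used in the settings above
gib_to_bytes 10    # prints 10737418240
```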

Set the specific retention settings for each topic

First, check the number of topics and the number of partitions of each topic.

# from the PunchPlatform operator environment
punchplatform-kafka-topics.sh --describe

Second, fill in the following Excel file.

Sizing File
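The sizing file remains the reference for the real figures. As a rough sketch of the kind of arithmetic involved, the per-partition retention can be estimated from the EPS, the average event size and the desired retention time. This assumes that events are spread evenly over the partitions and ignores compression and indexing overhead. The function name and the example figures are hypothetical. Note that Kafka applies retention.bytes per partition, not per topic.

```shell
# Hypothetical helper: estimate the bytes one partition must retain.
# Usage: estimate_partition_retention_bytes EPS AVG_EVENT_BYTES RETENTION_HOURS PARTITIONS
estimate_partition_retention_bytes() {
    local eps=$1
    local avg_event_bytes=$2
    local retention_hours=$3
    local partitions=$4
    # Total bytes written to the topic over the retention period.
    local total=$(( eps * avg_event_bytes * retention_hours * 3600 ))
    # Divide by the partition count, rounding up so the estimate never undershoots.
    echo $(( (total + partitions - 1) / partitions ))
}

# Example figures (assumptions): 1000 EPS, 500-byte events, 24 hours, 4 partitions.
estimate_partition_retention_bytes 1000 500 24 4    # prints 10800000000
```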

Then, update the production configuration of each topic with the settings shown in the red square.

# from the PunchPlatform operator environment
punchplatform-kafka-topics.sh --kafkaCluster <cluster> --topic <topic_name> --add-config retention.bytes=NNNNNNNNNNN
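When many topics have to be updated, the same command can be generated from a list instead of being typed by hand. The sketch below only prints the commands so that they can be reviewed before being run. The cluster name `local` comes from the settings above, while the topic names and sizes are made-up examples and `retention_command` is a hypothetical helper.

```shell
# Hypothetical helper: print (do not run) the retention command for one topic.
retention_command() {
    local cluster=$1
    local topic=$2
    local bytes=$3
    echo "punchplatform-kafka-topics.sh --kafkaCluster $cluster --topic $topic --add-config retention.bytes=$bytes"
}

# One "topic bytes" pair per line; names and values are examples only.
while read -r topic bytes; do
    retention_command local "$topic" "$bytes"
done <<'EOF'
topic_a 1073741824
topic_b 536870912
EOF
```

Once the printed commands look right, they can be pasted into the PunchPlatform operator environment.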

!!! warning "The retention.bytes of a topic must not exceed the main setting (partition_retention_bytes). The main setting must therefore be the maximum of the retention.bytes values of all topics."
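To derive the main setting from the per-topic values, it is enough to take the largest of them. A minimal sketch, where `max_retention_bytes` is a hypothetical helper and the values are examples:

```shell
# Hypothetical helper: print the largest of the retention.bytes values given as arguments.
max_retention_bytes() {
    local max=0
    local value
    for value in "$@"; do
        if [ "$value" -gt "$max" ]; then
            max=$value
        fi
    done
    echo "$max"
}

# Example per-topic values; the result is the minimum acceptable main setting.
max_retention_bytes 536870912 1073741824 268435456    # prints 1073741824
```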