Elasticsearch Housekeeping

Abstract

This chapter explains how to set up your Elasticsearch data lifecycle.

Overview

The retention and resilience requirements that apply to data stored in Elasticsearch change as the data gets older, and differ from one subset of data (e.g. one tenant's data) to another (e.g. a set of specific Elasticsearch indices).

A typical requirement looks like this:

Quote

I want my Elasticsearch data younger than 3 months to remain available for querying even if an Elasticsearch server is lost; but I do not want this for older data, because it is costly (storage and RAM overhead in Elasticsearch).

The Punch Elasticsearch housekeeping service is in charge of cleaning up or moving data around as required.

Configuration

The PunchPlatform Elasticsearch-housekeeper service is a small task run by the Shiva task scheduler. You can include it in a regular channel. To do so, first add the following job to the channel's channel_structure.json descriptor file:

{
    "version" : "5.0",
    "jobs" : [
        {
            "type" : "shiva",
            "name" : "elasticsearch-housekeeping",
            "command" : "elasticsearch-housekeeping",
            "args": [
                "elasticsearch-housekeeping.json"
            ],
            "resources": [
                "elasticsearch-housekeeping.json"
            ],
            "cluster" : "common",
            "shiva_runner_tags" : ["standalone"],
            "quartzcron_schedule" : "0 0 * ? * * *"
        }
    ]
}
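
The quartzcron_schedule value in the job above is a Quartz cron expression, which has six mandatory fields plus an optional year. As a reading aid, the following Python sketch simply labels each field; the helper name label_quartz_fields is hypothetical and is not part of PunchPlatform.

```python
# Field order of a Quartz cron expression (the year field is optional).
QUARTZ_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)


def label_quartz_fields(expression):
    """Map each whitespace-separated field of a Quartz expression to its name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("a Quartz cron expression has 6 or 7 fields")
    return dict(zip(QUARTZ_FIELDS, parts))


# The schedule used in the job above: second 0 of minute 0 of every hour.
hourly = label_quartz_fields("0 0 * ? * * *")
print(hourly)
```

Read this way, "0 0 * ? * * *" fires once per hour, at second 0 of minute 0.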

The channel also needs an elasticsearch-housekeeping.json file that defines your settings. This file contains the following fields:

{
    "clusters_settings": [
        {
            "cluster_id": "es_search",
            "actions": [
                {
                    "type": "close_indices",
                    "indices_prefix": "mytenant-events-",
                    "older_than_days": 7,
                    "indices_naming_time_format": "%Y.%m.%d"
                },
                {
                    "type": "delete_indices",
                    "indices_prefix": "mytenant-events-",
                    "older_than_days": 12,
                    "indices_naming_time_format": "%Y.%m.%d"
                }
            ]
        }
    ]
}

With this sample configuration, indices whose names start with mytenant-events- are automatically closed once the date in their name is more than 7 days old, and deleted once it is more than 12 days old.
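
The age rule of this sample can be sketched in Python as follows. This is only an illustration, not the housekeeper's code: the function name planned_action is hypothetical, the date is assumed to be the suffix right after the prefix, and treating an index as eligible when its age is greater than or equal to older_than_days is an assumption about the boundary.

```python
from datetime import date, datetime

PREFIX = "mytenant-events-"       # indices_prefix from the sample
TIME_FORMAT = "%Y.%m.%d"          # indices_naming_time_format from the sample
CLOSE_AFTER_DAYS = 7              # older_than_days of the close_indices action
DELETE_AFTER_DAYS = 12            # older_than_days of the delete_indices action


def planned_action(index_name, today):
    """Return 'delete', 'close' or 'keep' for one index name."""
    if not index_name.startswith(PREFIX):
        return "keep"
    # Assumption: the date is exactly the part of the name after the prefix.
    named_day = datetime.strptime(index_name[len(PREFIX):], TIME_FORMAT).date()
    age_days = (today - named_day).days
    # Assumption: an index is eligible when age >= older_than_days.
    if age_days >= DELETE_AFTER_DAYS:
        return "delete"
    if age_days >= CLOSE_AFTER_DAYS:
        return "close"
    return "keep"


print(planned_action("mytenant-events-2018.03.13", date(2018, 3, 20)))
```

Under these assumptions, the 7 most recent daily indices stay open and the next 5 are closed.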

Tip

Remember that after updating your configuration, you must save it and restart your channel.

Parameters

  • clusters_settings[].cluster_id: String

    • This is the identifier of the Elasticsearch cluster, i.e. the key of the cluster entry in the elasticsearch.clusters section of the [punchplatform.properties] file.
  • clusters_settings[].actions[].type: String

    • This is the kind of action to perform. Four values are supported: relocate_indices, change_replica_count, close_indices and delete_indices.
  • clusters_settings[].actions[].indices_prefix: String

    • Optional setting. The action only applies to indices whose names start with this prefix. Note that the PunchPlatform Elasticsearch housekeeper only ever applies actions to indices whose names begin with the <tenant name>- prefix, to avoid unwanted actions on other tenants' indices. This setting is therefore useful only to restrict the target indices further (e.g. mytenant-events-mytechnology-).
  • clusters_settings[].actions[].older_than_days: Integer

    • Optional setting. The action only applies to indices created at least this number of days ago. If the indices_naming_time_format option is not provided, this condition applies to the actual creation date of the index, regardless of the dates of the data stored in it.
  • clusters_settings[].actions[].new_replica_count: Integer

    • Mandatory setting (only with type=change_replica_count). A value of 0 means the primary shards have no replica, hence no resilience to the failure of a node holding a shard of this index.
  • clusters_settings[].actions[].indices_naming_time_format: String

    • Optional setting, meaningful only in addition to the older_than_days option. When this option is present, the age condition is not applied to the actual creation date of the index but to the date implied by its name. This option gives the date/time format used in the index naming convention. It must be a valid Python strftime string. It is used to match and extract the timestamp in an index or snapshot name, wherever the date/time string appears in the name. For more details on this setting, refer to the Elasticsearch Curator documentation.
  • clusters_settings[].actions[].box_type: String

    • Mandatory setting (only with type=relocate_indices). The value must match one of the values of the box_type tag associated with Elasticsearch nodes in the cluster (see punchplatform.properties, tag property of Elasticsearch nodes). When this value is changed for a given index, Elasticsearch moves the index to cluster nodes that have the same box_type tag value. This is useful, for example, if some of your nodes have SSDs and others use magnetic drives for cost reduction.
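
The rules above can be turned into a quick sanity check to run before deploying a settings file. The following Python sketch is a hypothetical helper, not a PunchPlatform tool: the function name check_housekeeping_config is invented, rejecting negative older_than_days values is an assumption, and so is the pairing of new_replica_count with the change_replica_count action (it follows the production sample of this chapter).

```python
SUPPORTED_TYPES = {
    "relocate_indices",
    "change_replica_count",
    "close_indices",
    "delete_indices",
}


def check_housekeeping_config(config):
    """Return a list of problems found in a housekeeping settings dict."""
    problems = []
    for cluster in config.get("clusters_settings", []):
        cluster_id = cluster.get("cluster_id")
        if not isinstance(cluster_id, str):
            problems.append("cluster entry without a string cluster_id")
        for position, action in enumerate(cluster.get("actions", [])):
            where = f"{cluster_id} action #{position}"
            kind = action.get("type")
            if kind not in SUPPORTED_TYPES:
                problems.append(f"{where}: unsupported type {kind!r}")
            days = action.get("older_than_days")
            # Assumption: a negative number of days is never meaningful.
            if days is not None and (type(days) is not int or days < 0):
                problems.append(f"{where}: older_than_days must be a non-negative integer")
            # The time format is only meaningful together with older_than_days.
            if "indices_naming_time_format" in action and days is None:
                problems.append(f"{where}: indices_naming_time_format needs older_than_days")
            # Assumption: new_replica_count belongs to change_replica_count.
            if kind == "change_replica_count" and type(action.get("new_replica_count")) is not int:
                problems.append(f"{where}: new_replica_count is required")
    return problems


sample = {"clusters_settings": [{"cluster_id": "es_search", "actions": [
    {"type": "close_indices", "indices_prefix": "mytenant-events-",
     "older_than_days": 7, "indices_naming_time_format": "%Y.%m.%d"},
]}]}
print(check_housekeeping_config(sample))
```

An empty list means none of these checks failed; it does not guarantee that the housekeeper accepts the file.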

Try It

On a standalone deployment, the Elasticsearch-housekeeper service is included in the configuration of the admin channel. To activate the Elasticsearch-housekeeper service, you must ensure that the admin channel is started.

punchctl --tenant mytenant start --channel admin

To test it, proceed as follows. First, stop the channel:

punchctl --tenant mytenant stop --channel admin

Create one year of daily indices in your Elasticsearch:

for i in {0..367} ; do indexName=mytenant-events-$(date --date=@$(expr $(date +%s) - $i '*' 86400) +%Y.%m.%d) ; curl -XPOST localhost:9200/$indexName/doc -d '{"value":42}' -H 'Content-type: application/json' ; echo "" ; done

Change the housekeeping execution frequency in tenants/mytenant/services/admin/service_structure.json so that it runs every minute:

  "quartzcron_schedule" : "0 * * ? * * *"

Start the housekeeper:

punchctl --tenant mytenant start --channel admin

Wait one minute, then check the state of the indices (you should have 7 open indices and 5 closed ones with names starting with mytenant-events):

curl localhost:9200/_cat/indices
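
Counting open and closed indices by eye is error-prone with that many lines. As a sketch, the Python function below tallies them from the two-column output of curl 'localhost:9200/_cat/indices?h=status,index' (the h parameter selects the columns). The function name count_by_status and the sample text are illustrative assumptions.

```python
from collections import Counter


def count_by_status(cat_output, prefix="mytenant-events-"):
    """Count indices per status from '_cat/indices?h=status,index' output."""
    counts = Counter()
    for line in cat_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        status, index_name = fields
        if index_name.startswith(prefix):
            counts[status] += 1
    return counts


# Hand-written sample shaped like the two requested columns.
sample_output = """\
open  mytenant-events-2018.03.20
open  mytenant-events-2018.03.19
close mytenant-events-2018.03.10
open  othertenant-events-2018.03.20
"""
print(dict(count_by_status(sample_output)))
```

On the test described above, the expectation is 7 indices with status open and 5 with status close.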

Production Setup

A typical set of production requirements for a tenant is:

  • indices older than 4 days are relocated to the warm nodes
  • indices are resilient until they are 1 month old (after that, archiving space protects the data): the initial replica count is 1, and older indices have a replica count of 0
  • indices older than 3 months are closed
  • indices older than 366 days are deleted.

This translates into the following configuration:

{
    "clusters_settings": [
        {
            "cluster_id": "es_search",
            "actions": [
                {
                    "type": "relocate_indices",
                    "indices_prefix": "mytenant-events-",
                    "older_than_days": 4,
                    "target_zone": "warm"
                },
                {
                    "type": "change_replica_count",
                    "indices_prefix": "mytenant-events-",
                    "older_than_days": 31,
                    "new_replica_count": 0
                },
                {
                    "type": "close_indices",
                    "indices_prefix": "mytenant-events-",
                    "older_than_days": 92
                },
                {
                    "type": "delete_indices",
                    "indices_prefix": "mytenant-",
                    "older_than_days": 366,
                    "indices_naming_time_format": "%Y.%m.%d"
                }
            ]
        }
    ]
}
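
Note that the delete action uses the broader mytenant- prefix, so it also covers indices that the other three actions ignore. The following Python sketch shows which actions of this configuration match a given index name and age. It is an illustration only: actions_for is a hypothetical helper, and the "age greater than or equal to older_than_days" comparison is an assumption about the boundary.

```python
# Prefixes and thresholds copied from the production configuration above.
PRODUCTION_ACTIONS = [
    {"type": "relocate_indices", "indices_prefix": "mytenant-events-", "older_than_days": 4},
    {"type": "change_replica_count", "indices_prefix": "mytenant-events-", "older_than_days": 31},
    {"type": "close_indices", "indices_prefix": "mytenant-events-", "older_than_days": 92},
    {"type": "delete_indices", "indices_prefix": "mytenant-", "older_than_days": 366},
]


def actions_for(index_name, age_days, actions=PRODUCTION_ACTIONS):
    """List the action types whose prefix and age condition both match."""
    return [
        action["type"]
        for action in actions
        if index_name.startswith(action.get("indices_prefix", ""))
        and age_days >= action.get("older_than_days", 0)
    ]


print(actions_for("mytenant-events-2018.03.07", 40))
print(actions_for("mytenant-metrics-2017.01.01", 400))
```

The index name mytenant-metrics-2017.01.01 is made up for the example; it only matches the delete action because of its prefix.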

HowTos

In many cases, you may want to prevent the automatic housekeeping (closing, purging...) of some indices, for example because you want to reopen an old index manually and do not want it to be closed or deleted while you are working on it.

Good news: you can inhibit all housekeeping actions on given indices simply by adding them to a special Elasticsearch alias: <tenant name>-no-housekeeping (e.g. mytenant-no-housekeeping).

How to inhibit housekeeping actions for an index

To protect a given index (here, mytenant-events-2018.03.07), use the Elasticsearch alias API to add it to the alias that prevents any action by the PunchPlatform housekeeping service:

Send a POST request to <EsApiNodeURL>/_aliases with a JSON body of the form:

{
    "actions": [
        {"add": {
            "index": "<index name>",
            "alias": "<tenant name>-no-housekeeping"
        }}
    ]
}

On a standalone deployment, this will for example be :

curl -XPOST localhost:9200/_aliases -d '{"actions":[{"add":{"index":"mytenant-events-2018.03.07","alias":"mytenant-no-housekeeping"}}]}' -H 'Content-Type: application/json'
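
When protecting many indices, typing the JSON body by hand invites quoting mistakes. A minimal Python sketch for generating it is shown below; alias_action_body is a hypothetical helper name, and the same body shape with "remove" instead of "add" takes an index back out of the alias.

```python
import json


def alias_action_body(operation, index_name, tenant):
    """Build the JSON body for POST /_aliases on the no-housekeeping alias."""
    if operation not in ("add", "remove"):
        raise ValueError("operation must be 'add' or 'remove'")
    body = {
        "actions": [
            {operation: {
                "index": index_name,
                "alias": f"{tenant}-no-housekeeping",
            }}
        ]
    }
    return json.dumps(body)


print(alias_action_body("add", "mytenant-events-2018.03.07", "mytenant"))
```

The printed string can be passed as the -d argument of the curl command above.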

How to reopen an index

After preventing housekeeping of your index as explained above, you can reopen it to query it, and the housekeeper will leave it alone:

curl -XPOST localhost:9200/mytenant-events-2018.03.07/_open

How to reactivate housekeeping actions for an index

To remove an index from the special alias that prevents housekeeping actions, use the following Elasticsearch alias API call:

curl -XPOST localhost:9200/_aliases -d '{"actions":[{"remove":{"index":"mytenant-events-2018.03.07","alias":"mytenant-no-housekeeping"}}]}' -H 'Content-Type: application/json'

How to know which indices are prevented from housekeeping

To list the indices on which the PunchPlatform housekeeping service takes no action, use the following Elasticsearch alias API call:

GET method on <EsApiNodeURL>/_alias/<tenant name>-no-housekeeping

curl -s localhost:9200/_alias/mytenant-no-housekeeping | jq -r 'keys[]'
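
If jq is not available, the same listing can be obtained with a few lines of Python, since the response is a JSON object keyed by index name. This is a sketch: protected_indices is a hypothetical helper name and the sample response is hand-written for illustration.

```python
import json


def protected_indices(alias_response_text):
    """Return the sorted index names found in a GET /_alias/<alias> response."""
    return sorted(json.loads(alias_response_text).keys())


# Hand-written sample shaped like an Elasticsearch alias response.
sample_response = """
{
  "mytenant-events-2018.03.07": {"aliases": {"mytenant-no-housekeeping": {}}},
  "mytenant-events-2018.03.01": {"aliases": {"mytenant-no-housekeeping": {}}}
}
"""
for name in protected_indices(sample_response):
    print(name)
```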