HOWTO Elasticsearch REST API tips and tricks

Why do that¶

To administer an Elasticsearch cluster properly, you sometimes need to check and control its health more precisely through its REST API. This page is only a summary; the full documentation is available on the Elasticsearch website.

Prerequisites¶

You need a working Elasticsearch cluster in your PunchPlatform.
You must be logged in to the PunchPlatform Admin server, or to any other host that can reach at least one node on port 9200/TCP.

curl -XGET "http://<ES_NODE>:9200" # should yield at least the Elasticsearch version
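
The reachability check above can be wrapped in a small helper so that later commands fail early when no node answers. This is a minimal sketch: `es_url` and `es_ping` are hypothetical helper names, and the `ES_NODE` variable stands for the `<ES_NODE>` placeholder used on this page.

```shell
#!/bin/sh
# ES_NODE is an assumption: set it to a node you can actually reach.
ES_NODE="${ES_NODE:-localhost}"

# Build the full URL for a REST path (a leading slash is tolerated).
es_url() {
  printf 'http://%s:9200/%s\n' "$ES_NODE" "${1#/}"
}

# Return 0 if the node answers on its HTTP port within 5 seconds.
es_ping() {
  curl -s -f -o /dev/null --max-time 5 "$(es_url "")"
}

# Usage:
#   es_ping && echo "reachable" || echo "unreachable"
```

The helpers only build URLs and call `curl`; nothing is sent until `es_ping` is invoked.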

Tips and tricks¶

Basic commands¶

Main GET requests and their arguments:

curl -XGET "http://<ES_NODE>:9200/_cluster/health?pretty"  # pretty prints the JSON response over several lines
curl -XGET "http://<ES_NODE>:9200/_nodes/stats?pretty"     # per-node statistics
curl -XGET "http://<ES_NODE>:9200/_cluster/stats?pretty"   # cluster-wide statistics
curl -XGET "http://<ES_NODE>:9200/_cat/nodes?h=ip,master"  # h selects the columns you want ES to return
curl -XGET "http://<ES_NODE>:9200/_cat/nodes?help"         # help lists the columns you can request
curl -XGET "http://<ES_NODE>:9200/myevents/_search?pretty" # search the myevents index

Getting overall health and status:

curl -XGET "http://<ES_NODE>:9200/_cat/health?v" # Health of the cluster
curl -XGET "http://<ES_NODE>:9200/_cat/nodes?h=name,load,master,ip&v"| sort # Health of the cluster's nodes
curl -XGET "http://<ES_NODE>:9200/_cat/indices" # Health of the cluster's indexes
curl -XGET "http://<ES_NODE>:9200/_cat/shards?h=index,shard,prirep,state,docs,ur,ua" # Health of the indexes' shards. Grep in it to find information
curl -XGET "http://<ES_NODE>:9200/_cat/shards" # Health of the indexes' shards. Grep in it to find information
curl -XGET "http://<ES_NODE>:9200/_cat/recovery?h=index,shard,stage,shost,thost,bp,tb" # Progress of the ongoing recoveries of yellow indexes
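
Rather than grepping the shard list by hand, the output can be filtered to keep only the shards that are not started. This is a minimal sketch: `problem_shards` is a hypothetical helper name, and it assumes the columns are requested in the order index, shard, prirep, state, so that the state is the 4th column.

```shell
#!/bin/sh
# Print only the shards whose state (4th column) is not STARTED.
# Expects the output of _cat/shards?h=index,shard,prirep,state,... on stdin.
problem_shards() {
  awk '$4 != "STARTED"'
}

# Usage:
#   curl -s "http://<ES_NODE>:9200/_cat/shards?h=index,shard,prirep,state,docs,ur,ua" | problem_shards
```

An empty result means every listed shard is in the STARTED state.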

Making a query
Return at most two logs whose target IP is 2.2.22.222 and whose vendor is Arkoon (query string, limited to 15000 characters):

curl -XGET "http://<ES_NODE>:9200/<my_index_pattern>/_search?size=2&q=target.host.ip:2.2.22.222%20AND%20vendor:arkoon"
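
The spaces around a boolean operator such as AND must be percent-encoded when the query string is placed in a URL. The sketch below shows one way to do it; `encode_spaces` is a hypothetical helper that handles spaces only, and the commented `curl -G --data-urlencode` variant lets curl do the full encoding.

```shell
#!/bin/sh
# Replace each space of a query string with %20.
# Only spaces are handled; other reserved characters are left untouched.
encode_spaces() {
  printf '%s' "$1" | sed 's/ /%20/g'
}

# Usage:
#   q="$(encode_spaces 'target.host.ip:2.2.22.222 AND vendor:arkoon')"
#   curl -XGET "http://<ES_NODE>:9200/<my_index_pattern>/_search?size=2&q=$q"
#
# Alternative: let curl encode the parameters itself.
#   curl -G "http://<ES_NODE>:9200/<my_index_pattern>/_search" \
#        --data-urlencode "q=target.host.ip:2.2.22.222 AND vendor:arkoon" \
#        --data-urlencode "size=2"
```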

Check for several structured criteria with a request body (no size limit):

curl -XGET "http://<ES_NODE>:9200/_search" -H "Content-Type: application/json" -d'
  {
    "query": {
      "bool": {
        "must": [
          { "match": { "target.host.ip": "2.2.22.222" } },
          { "match": { "vendor": "arkoon" } }
        ],
        "filter": [
          { "term":  { "action": "DROP" } },
          { "range": { "publish_date": { "gte": "2015-01-01" } } }
        ]
      }
    }
  }'
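
When the same structured query is run for different values, the request body can be generated by a small function and piped to curl. This is a minimal sketch: `build_query` is a hypothetical helper, it only covers the two `match` clauses, and it does no JSON escaping, so it assumes plain values such as IP addresses or vendor names.

```shell
#!/bin/sh
# Print a bool query matching a target IP ($1) and a vendor ($2).
# Values are inserted as-is: no JSON escaping is performed.
build_query() {
  cat <<EOF
{
  "query": {
    "bool": {
      "must": [
        { "match": { "target.host.ip": "$1" } },
        { "match": { "vendor": "$2" } }
      ]
    }
  }
}
EOF
}

# Usage (curl reads the body from stdin with -d @-):
#   build_query 2.2.22.222 arkoon | \
#     curl -s -XGET "http://<ES_NODE>:9200/_search?pretty" \
#          -H "Content-Type: application/json" -d @-
```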

Accelerating return to GREEN state from YELLOW¶

This trick is usually needed before performing operations that expect the cluster to be green. There is no magic, however: the cluster already recovers as fast as it can, and each of the operations below carries an operational risk.

First, you can raise or lower the recovery rate. RISK: lowering it can make the recovery take much longer; raising it can push the overall load very high, which slows down insertion and may overload the master.

curl -XPUT "http://<ES_NODE>:9200/_cluster/settings" -H "Content-Type: application/json" -d'
  {
    "transient": {
        "indices.recovery.max_bytes_per_sec": "5mb"
    }
  }'
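
Because a mistyped rate is easy to send, the settings body can be generated by a function that checks the value first. This is a minimal sketch: `recovery_rate_body` is a hypothetical helper, and the check (digits followed by kb, mb or gb) is an assumption that is deliberately stricter than what Elasticsearch accepts.

```shell
#!/bin/sh
# Print the transient settings body for a recovery rate such as "20mb".
# Returns 1 if the value is not made of digits followed by kb, mb or gb.
recovery_rate_body() {
  rate="$1"
  digits="${rate%[kmg]b}"   # strip a trailing kb/mb/gb unit, if any
  case "$digits" in
    ''|*[!0-9]*) echo "invalid rate: $rate" >&2; return 1 ;;
  esac
  # If nothing was stripped, the unit was missing.
  [ "$digits" != "$rate" ] || { echo "missing unit in: $rate" >&2; return 1; }
  printf '{"transient":{"indices.recovery.max_bytes_per_sec":"%s"}}\n' "$rate"
}

# Usage:
#   recovery_rate_body 20mb | \
#     curl -s -XPUT "http://<ES_NODE>:9200/_cluster/settings" \
#          -H "Content-Type: application/json" -d @-
```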

Then you can exclude some indexes from recovery by lowering their number of replicas, here to 0. RISK: with no replica left, any shard failure results in permanent data loss.

curl -XPUT "http://<ES_NODE>:9200/<my-index-pattern>/_settings" -H "Content-Type: application/json" -d'
  {
    "index": {
      "number_of_replicas": 0
    }
  }'
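
After changing these settings, you usually want to wait until the cluster is green again before going further. A simple polling loop can do that; this is a minimal sketch in which `fetch_status` and `wait_for_green` are hypothetical helper names and `ES_NODE` stands for the `<ES_NODE>` placeholder.

```shell
#!/bin/sh
# ES_NODE is an assumption: set it to a node you can actually reach.
ES_NODE="${ES_NODE:-localhost}"

# Print the cluster status (green, yellow or red) without whitespace.
fetch_status() {
  curl -s "http://$ES_NODE:9200/_cat/health?h=status" | tr -d '[:space:]'
}

# Poll until the cluster is green.
# $1: maximum number of attempts (default 30)
# $2: delay in seconds between attempts (default 10)
# Returns 0 once green, 1 if the attempts are exhausted.
wait_for_green() {
  tries="${1:-30}"
  delay="${2:-10}"
  while [ "$tries" -gt 0 ]; do
    [ "$(fetch_status)" = "green" ] && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_green 60 5 && echo "cluster is green" || echo "still not green"
```

The cluster health API also accepts a `wait_for_status=green` parameter with a `timeout`, which blocks on the server side instead of polling.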