Elastic Query Stats
Compatible Spark
The elastic_query_stats node lets you collect statistics about an Elasticsearch query and load the results into your Spark cluster.
Example
Basic configuration
This configuration outputs a dataframe with a single row whose columns contain the statistics about the Elasticsearch query.
    {
      job: [
        {
          type: elastic_query_stats
          component: input
          publish: [
            {
              stream: data
            }
          ]
          settings: {
            index: "mytenant-events*"
            nodes: [
              localhost
            ]
            count_value: true
            query: {
              query: {
                bool: {
                  must: [
                    {
                      range: {
                        "@timestamp": {
                          gte: now-1h
                          lt: now
                        }
                      }
                    }
                  ]
                }
              }
              size: 0
            }
          }
        }
      ]
    }
For clarity, here is the corresponding Elasticsearch query.
    curl -XGET "http://localhost:9200/mytenant-events*/_search" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": "now-1h",
                  "lt": "now"
                }
              }
            }
          ]
        }
      },
      "size": 0
    }'
Instead of returning the documents matched by this request, the node retrieves statistics about the query, such as the response time and the total number of hits.
Configuration(s)
- index: String
  Description: [Required] The name of the Elasticsearch index from which data is fetched.
- port: Integer
  Description: [Optional] The port of your Elasticsearch server.
- type: String
  Description: [Optional] The document type to retrieve from your Elasticsearch index.
- query: Json
  Description: [Optional] A valid Elasticsearch query.
- nodes: List
  Description: [Required] The hostnames of your Elasticsearch nodes. In general, a single hostname is enough.
- count_value: Boolean
  Description: [Optional] Defaults to false. Set to true to also get the total number of documents in the selected index.
Warning
If you enable this option, an additional request is sent to Elasticsearch to get the total number of documents in the index. This may affect your Elasticsearch statistics.
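To illustrate the warning, here is a minimal Python sketch of the requests that one run could involve. It only builds the requests and sends nothing. The source does not say which endpoint the node calls for the document count, so the use of the Elasticsearch `_count` API is an assumption, as are the `build_requests` helper and the default port 9200 (taken from the curl example).

```python
import json
from urllib.request import Request


def build_requests(node, index, query, port=9200, count_value=False):
    """Build (but do not send) the HTTP requests for one stats run.

    Hypothetical helper: the real node may talk to Elasticsearch differently.
    """
    base = f"http://{node}:{port}/{index}"
    headers = {"Content-Type": "application/json"}
    # The search request carrying the user-supplied query.
    requests = [
        Request(
            f"{base}/_search",
            data=json.dumps(query).encode("utf-8"),
            headers=headers,
            method="GET",
        )
    ]
    if count_value:
        # Assumption: the extra request resembles a call to the _count API.
        requests.append(Request(f"{base}/_count", method="GET"))
    return requests


if __name__ == "__main__":
    for req in build_requests("localhost", "mytenant-events*", {"size": 0}, count_value=True):
        print(req.get_method(), req.full_url)
```

With `count_value` left at false, only the search request is built, which is why the option doubles the number of calls seen by Elasticsearch when it is enabled.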
Return values
This node outputs a single dataset with the following columns:
- component
  Description: [Always] The component name.
- count
  Description: [Optional] Total number of documents in the selected index.
- hits.total
  Description: [Always] Number of documents matched by the request.
- hits.max_score
  Description: [Always] Elasticsearch score for the request.
- query
  Description: [Always] The request sent, useful for debugging.
- timestamp
  Description: [Always] Unix timestamp in milliseconds.
- took
  Description: [Always] Response time.
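The columns listed here share their names with fields of a standard Elasticsearch search response. The Python sketch below shows one plausible way to flatten such a response into a single row. It is an illustration, not the node's actual implementation: the `stats_row` helper and the sample response values are hypothetical.

```python
import json
import time

# Hypothetical search response, trimmed to the fields of interest.
SAMPLE_RESPONSE = json.loads("""
{
  "took": 12,
  "timed_out": false,
  "hits": {"total": 1532, "max_score": null, "hits": []}
}
""")


def stats_row(response, component, query, count=None):
    """Flatten a search response into one row of statistics."""
    hits = response["hits"]
    total = hits["total"]
    # Elasticsearch 7+ reports {"value": N, "relation": ...};
    # older versions report a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    row = {
        "component": component,
        "hits.total": total,
        "hits.max_score": hits.get("max_score"),
        "query": json.dumps(query, sort_keys=True),
        "timestamp": int(time.time() * 1000),  # Unix time in milliseconds
        "took": response["took"],
    }
    # The count column is only present when a document count was requested.
    if count is not None:
        row["count"] = count
    return row


if __name__ == "__main__":
    print(stats_row(SAMPLE_RESPONSE, "input", {"size": 0}))
```

Keeping the row as a flat mapping mirrors the single-row dataset described above, with `count` added only when a document count is available.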