GeneratorSpout¶
The GeneratorSpout simply publishes fake data. It can be used to run unit tests or to help you design topologies.
Here is a complete configuration example.
    {
      "type": "generator_spout",
      "spout_settings": {
        "messages": [
          "my first log",
          "My second message log",
          "And finally a third one"
        ]
      },
      "storm_settings": {...}
    }
Complete examples¶
The Generator Spout can work in two different ways:
- Publish to a single stream and field
- Publish to several streams and fields
In the first case, "messages" contains an array of text logs. Each message is sent to the stream and field declared under "publish" in "storm_settings".
Here is a complete example:
    {
      "type": "generator_spout",
      "spout_settings": {
        "messages": [
          "my first log",
          "My second message log",
          "And finally a third one"
        ]
      },
      "storm_settings": {
        "component": "generator",
        "publish": [
          {
            "stream": "logs",
            "fields": [
              "log"
            ]
          }
        ]
      }
    }
In the second case, the "messages" property is an array of JSON objects. Note that no "publish" key is defined in the "storm_settings" section. Instead, each message fully defines its own stream and fields.
Here is a complete example:
    {
      "type": "generator_spout",
      "spout_settings": {
        "messages": [
          { "logs": { "log": "my first log" }},
          { "logs": { "foo1": "bar", "foo2": "baar", "foo3": "baaar" }},
          { "other": { "log": "Here I am on another stream!" }}
        ]
      },
      "storm_settings": {
        "component": "generator"
      }
    }
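To make the two message forms concrete, the following Python sketch models how one configured message could be resolved into a stream and its fields. It is not the spout's implementation: the helper name `resolve_message` is hypothetical, and storing a plain text log in the first declared field of each "publish" entry is an assumption made for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

# A message is either a plain text log or a {stream: {field: value}} object.
Message = Union[str, Dict[str, Dict[str, Any]]]


def resolve_message(
    message: Message,
    publish: Optional[List[Dict[str, Any]]] = None,
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (stream, fields) pairs for one configured message.

    Hypothetical helper that only mirrors the behaviour described above.
    """
    if isinstance(message, str):
        # First case: a plain text log, routed using storm_settings.publish.
        if not publish:
            raise ValueError("plain text messages need a 'publish' section")
        # Assumption: the text goes into the first declared field.
        return [(p["stream"], {p["fields"][0]: message}) for p in publish]
    # Second case: each top-level key names a stream, its value holds the fields.
    return [(stream, dict(fields)) for stream, fields in message.items()]
```

With the first example above, `resolve_message("my first log", [{"stream": "logs", "fields": ["log"]}])` yields one pair for the `logs` stream, while a JSON message such as `{"other": {"log": "..."}}` needs no "publish" argument at all.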
Load generation¶
If you need a lot of messages but do not want to copy-paste thousands of lines, you can use the "messages_count" setting to indicate the total number of messages to generate.
The provided "messages" list is reused again and again until the requested number of messages has been emitted.
By default, the spout waits 1 second between messages. For faster emission, use the "interval" setting to give the (approximate) number of milliseconds to wait between two messages. A value of 0 yields the highest rate the spout can achieve.
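The interplay between "messages", "messages_count" and "interval" can be modelled with a short Python sketch. This is only an illustration of the behaviour described above, not the spout's code: the function name `generate` and the parameter `interval_ms` are hypothetical, and it assumes the list is replayed in its original order.

```python
import itertools
import time
from typing import Iterator, List, Optional


def generate(
    messages: List[str],
    messages_count: Optional[int] = None,
    interval_ms: int = 1000,
) -> Iterator[str]:
    """Yield messages, cycling through the list until the count is reached."""
    # Default described in the Parameters section: one pass over the list.
    total = len(messages) if messages_count is None else messages_count
    # Assumption: the list is replayed in order, wrapping around at the end.
    for message in itertools.islice(itertools.cycle(messages), total):
        yield message
        if interval_ms > 0:
            # Approximate pause between two consecutive messages.
            time.sleep(interval_ms / 1000.0)
```

For instance, three configured messages with a "messages_count" of 7 produce two full passes over the list followed by the first message once more.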
If you need some variation between the messages generated from the fixed "messages" list, you can include the special tag %{message_num} inside your message strings. It is replaced by the message number (starting at 1).
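As a rough illustration of the tag substitution, the Python sketch below replaces the tag in each emitted message. The function name `numbered` is hypothetical, and it assumes the number counts every emitted message across the whole run rather than restarting for each entry of the list.

```python
from typing import Iterator, List

TAG = "%{message_num}"


def numbered(messages: List[str], messages_count: int) -> Iterator[str]:
    """Yield messages_count messages with the tag replaced by a 1-based number."""
    for num in range(1, messages_count + 1):
        # Cycle through the fixed list of templates.
        template = messages[(num - 1) % len(messages)]
        # Assumption: the counter is global to the run, not per template.
        yield template.replace(TAG, str(num))
```

Calling `list(numbered(["## LOG %{message_num} ##"], 3))` gives three distinct log lines numbered 1 to 3 from a single template.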
Here is an example of a load-generator topology that sends a million log documents with distinct document ids and contents (here, to load an Elasticsearch cluster):
    {
      "tenant": "validation-kafka",
      "channel": "kafka",
      "name": "single",
      "spouts": [
        {
          "type": "generator_spout",
          "spout_settings": {
            "messages_count" : 1000000,
            "interval" : 0,
            "messages": [
              {
                "logs": {
                  "log": "## LOG %{message_num} ##",
                  "_ppf_id": "msg-%{message_num}"
                }
              }
            ]
          },
          "storm_settings": {
            "component": "generator"
          }
        }
      ],
      "bolts": [
        {
          "type": "elasticsearch_bolt",
          "bolt_settings": {
            "cluster_id": "es_search",
            "reindex_failed_documents" : true,
            "error_index" : {
              "type" : "daily",
              "prefix" : "mytenant-events-indexation-errors-"
            },
            "per_stream_settings": [
              {
                "stream": "logs",
                "index": {
                  "type": "daily",
                  "prefix": "mytenant-events-"
                },
                "document_value_fields": ["log"],
                "document_id_field" : "_ppf_id",
                "additional_document_value_fields": [
                  {
                    "type": "date",
                    "document_field": "@timestamp",
                    "format": "iso"
                  }
                ]
              }
            ]
          },
          "storm_settings": {
            "component": "elasticsearch_bolt",
            "subscribe": [
              {
                "component": "generator",
                "stream": "logs"
              }
            ]
          }
        }
      ],
      "storm_settings" : {
        "topology.worker.childopts" : "-Xmx1G -Xms1G",
        "xtopology.max.spout.pending" : 30000
      }
    }
Parameters¶
interval
: Number, default 1000
OPTIONAL: Interval in milliseconds between two consecutive messages. Defaults to 1000 (1 second).
messages_count
: Number
OPTIONAL: To generate a large number of messages, set "messages_count"; the generator then sends the configured messages repeatedly until the requested count is reached. Defaults to the number of entries in "messages".