punchplatform-log-injector -c $PUNCHPLATFORM_CONF_DIR/resources/injector -n 100 -t 50

DESCRIPTION

punchplatform-log-injector is a fully configurable log injector. You can use it to inject test messages into any message processing platform. To do so, you define JSON injection configuration files that describe the load characteristics, the log message format, and the destination input point.

The injector is capable of writing to sockets (udp, tcp), to Kafka, to Elasticsearch, to Clickhouse and to lumberjack endpoints.

The injector can also play a server role. You can use it to bench your networking plane, or to test a punchplatform. It can read from tcp, udp, lumberjack and Kafka.


  • -brokers <arg> :

    • the kafka broker where to read from.
  • -c <campaign.json> | -c <campaign1.json>,<campaign2.json> | -c <campaign directory> :

    • Run a single or several campaigns, or all campaigns found in the specified directory. If you run several campaigns, each will be run using a dedicated thread.
  • -check,--dump :

    • dump to stdout instead of injecting.
  • -cl | --lumberjack-client :

    • Starts running as a Lumberjack client. You must set the server host and port number using the host and port options. By default the lumberjack client sends 32-byte strings. If you want to simulate real traffic you must define an injection configuration file.
  • -d | --delay <value_in_ms> :

    • overrides injection file throughput or inter message delay, in milliseconds. This can be used to inject very low traffic rates.
  • -earliest :

    • start consuming Kafka messages from the earliest offset.
  • -h,--help :

    • print this message.
  • -H,--host <arg> :

    • overrides injection file destination host.
  • -k | --kafka-consumer :

    • Starts running as Kafka consumer. You must set the topic and broker options.
  • -latest :

    • start consuming Kafka messages from the latest offset.
  • -n | --number <message_number> :

    • Exits after that many messages have been sent.
  • -p | --port <port> :

    • Set the (destination or listening) port number. This overrides the port defined in the configuration file, if any.
  • -punchlets,--punchlets <arg> :

    • stress a chain of punchlets (comma separated).
  • -resources,--resources <arg> :

    • add punchlet resources (comma separated).
  • -q,--silent :

    • reduce verbosity to error messages.
  • -sl | --lumberjack-server :

    • Starts running as a Lumberjack server. You must set the port number using the port option.
  • -st | --tcp-server :

    • Starts running as a plain TCP server. You must set the port option.
  • -stream :

    • Define the Storm stream for injected logs.
  • -t | --throughput <rate> :

    • Define the traffic rate, overriding the rate defined in the configuration file, if any.
  • --thread <thread-number> :

    • By default each injection is single-threaded. To simulate several connections to the server, increase the number of threads. Each thread takes a part of the total throughput defined in your scenario.
  • -it, --inactivity-timeout <timeout string> :

    • Inactivity duration before the injector exits. Defaults to infinity.
  • -topic <arg> :

    • the kafka topic.
  • -ts,--tcp-server <arg> :

    • act as tcp server to count the number of received logs. You must set the port number.
  • -u,--udp:

    • use udp.
  • -us,--udp-server <arg>:

    • act as udp server to count the number of received logs. You must set the port number.
  • -v,--verbose :

    • prints out the read data. It only works with some senders or receivers.
  • -w,--connection-timeout <arg> :

    • defines maximum wait time in ms for the receiver port to be available (not in udp mode) - 0 (default value) means infinite wait. Also applies on reconnection after connection loss.
  • --sustain :

    • This option is relevant only for the lumberjack client. The client will send increasing traffic to the server, and will stop when the window of unacknowledged messages reaches 1000. This lets you easily check the bandwidth of your system.
  • -lj,--lumberjack-json-fields-payload :

    • Instead of being injected as a string in the 'log' field, the provided payload message is expected to be a JSON string that defines the root fields and values of the lumberjack frame. This allows using a field other than 'log', or providing multi-field lumberjack frames.
  • -cp, --compression:

    • Enable compression for Lumberjack protocol (option valid for Lumberjack server and client).
  • --ssl_private_key :

    • This option is relevant only for the lumberjack client and server. Specify a private key path.
  • --ssl_certificate :

    • This option is relevant only for the lumberjack client and server. Specify a certificate path.
  • --ssl_protocol :

    • This option is relevant only for the lumberjack client and server. Specify SSL protocol between TLSv1.2 (default), TLSv1.1, TLSv1.0.
  • --ssl_provider :

    • This option is relevant only for the lumberjack client. Specify a SSL provider between JDK (default), OPENSSL, OPENSSL_REFCNT.
  • --ssl_ciphers :

    • This option is relevant only for the lumberjack client. Overrides the SSL ciphers. By default the provider cipher suite is used. Specify a comma-separated list for custom ciphers, for example: TLS_DHE_RSA_WITH_AES_256_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA


Inject Apache traffic. The destination and load characteristics are defined in an injection file:

    punchplatform-log-injector -c apache_injection.json

Same, but changing the rate to 1500 messages per second on stream [logs][log]:

    punchplatform-log-injector -c apache_injection.json --throughput 1500 --stream [logs][log]

Run a lumberjack server listening on tcp/lumberjack port 21212:

    punchplatform-log-injector --lumberjack-server --port 21212 --compression

Send lumberjack traffic to the server we just started:

    punchplatform-log-injector -c lumberjack_injector.json -t 1000 --compression

Same with SSL, first the server, then the client:

    punchplatform-log-injector --lumberjack-server --port 21212 --compression \
      --ssl_private_key conf/resources/ssl/server-key.pem \
      --ssl_certificate conf/resources/ssl/server-cert.pem

    punchplatform-log-injector -c lumberjack_injector.json -t 1000 --compression \
      --ssl_private_key conf/resources/ssl/server-key.pem \
      --ssl_certificate conf/resources/ssl/server-cert.pem

Note that you have to specify the protocol in the injector configuration file, by setting that field to lumberjack. If you need an example, check the $PUNCHPLATFORM_CONF_DIR/conf/resources/injector/examples/ directory. Take care: you cannot combine the --punchlets parameter with lumberjack. In that case the injector only sends the data to the punchlets, without taking the protocol into account.

Check the traffic received from a Kafka topic. Note that the 'local' Kafka broker must be defined in your file, i.e. this option only works on an installed punchplatform node:

    punchplatform-log-injector --kafka-consumer --topic apache --brokers local -v

Test or stress a punchlet pipeline to measure overall performance:

    punchplatform-log-injector -c [injection_file].json --punchlets p1.punch,p2.punch,... --resources r1.json,r2.json,...

Send lumberjack traffic to the server we just started, with a custom SSL configuration:

    punchplatform-log-injector -c lumberjack_injector.json -t 1000 \
      --ssl_private_key resources/ssl/certs/gateway/gateway-super-1-key-pkcs8.pem \
      --ssl_certificate resources/ssl/certs/gateway/gateway-super-1-cert.pem \
      --ssl_protocol TLSv1.2 \
      --ssl_provider JDK

Configuration file

The injection file is a JSON file, in which you are free to add '#' prefixed comments.
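Because of the '#' comments, a standard JSON parser rejects an injection file as is. If you want to validate a file with your own tooling, a minimal Python sketch could look like the following. It assumes that comments always sit on their own line (as in the samples of this page) and that the sections are wrapped in a single top-level object. The function name `load_injection_config` and the `localhost` host value are illustrative only, and this is not the injector's own parser.

```python
import json

def load_injection_config(text):
    """Parse injection-file text after dropping full-line '#' comments."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("#")]
    return json.loads("\n".join(kept))

# Hypothetical file content; "localhost" is an assumed host value.
sample = '''
{
    # where to send the generated data
    "destination" : { "proto" : "tcp", "host" : "localhost", "port" : 9999 },
    # how fast to send it
    "load" : { "message_throughput" : 1000, "total_messages" : 1000000 }
}
'''

config = load_injection_config(sample)
print(config["destination"]["proto"], config["load"]["message_throughput"])
```

A '#' character inside a string value on a non-comment line is left untouched, since only lines that start with '#' are dropped.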

The various sections are described below.

Destination Section

You can set the data destination in your injection file, i.e. where you want to send your generated data. This section is optional: you can define the destination using the command-line parameters instead. If you set one in the file, you can still override it using command-line parameters. Here is an example that sends your generated data to a TCP server.

    "destination" : { 
        "proto" : "tcp", 
        "host" : "", 
        "port" : 9999 
    }

The supported destinations are:

  • tcp: send the data to a TCP server
  • udp: send the data to a UDP server
  • lumberjack: send the data to a Lumberjack server
  • http: perform POST REST requests to an HTTP server
  • stdout: just print out the generated data. Use it for debugging purposes and copy-pasting.
  • kafka: act as a kafka producer, toward a given topic
  • elasticsearch: inject data directly to an Elasticsearch cluster.

Here are example configurations for the "destination" section:

# all these require plain host port parameters
{ "proto" : "tcp", "host" : "", "port" : 9999 }
{ "proto" : "udp", "host" : "", "port" : 9999 }
{ "proto" : "lumberjack", "host" : "", "port" : 9999, "compression": false }
{ "proto" : "http", "host" : "", "port" : 9999, 
    "http_method": "POST", 
    "http_root_url": "/", 
    "bulk_size": 1 }

# Elasticsearch configuration ('port' is optional, 'bulk_size' default is 1)
{ "proto" : "elasticsearch", "host": "", "port": 9300, 
    "cluster_name" : "es_search", 
    "index": "test", 
    "type": "doc", 
    "bulk_size": 1000 }

# Kafka only accepts a "brokers" name that must be defined in your 
# file. That is: this option only works
# (as of today) on an installed punchplatform.
{ "proto": "kafka", 
    "brokers": "local", 
    "topic": "mytenant_bluecoat_proxysg" }

Load Section

This section lets you control the injector's throughput. It is also optional if you prefer using command-line parameters.

"load" :{

    # "message_throughput" indicates the number of message per second.
    # Sometimes you want to inject fewer message than 1 per second, 
    # you can then use the alternative property : "inter_message_delay" 
    # For example to inject one message every 30 seconds :
    #   "inter_message_delay" : 30
    "message_throughput" : 1000,

    # Optional : control how often you have a recap stdout message. 
    "stats_publish_interval" : "2s",

    # The total number of message. Use -1 for almost infinite (2³¹-1 messages). 
    "total_messages" : 1000000,

    # Optional : make your throughput fixed or variable. By default fixed.
    # Using "variable" makes your load vary between 50 and 150 % of your 
    # set throughput.
    "type" : "fixed"
}

Punchlets Performance Test

The injector is well suited to stressing one punchlet, or a chain of punchlets, under a high load of data. Using the --punchlets argument you set up a chain of punchlets that is traversed by large volumes of (representative) data.

To check that everything runs fine before stressing the punchlets, use the -v option to dump the punchlet results. Again, the -t option is your friend here to do that slowly:

    punchplatform-log-injector -c <json-injection-file> --punchlets punchlet1,punchlet2,.. -t 1 -v

If you need to include punchlet resources, use the --resources option:

    punchplatform-log-injector -c <json-injection-file> \
        --punchlets standard/common/input.punch,standard/common/parsing_syslog_header.punch,... \
        --resources standard/apache_httpd/taxonomy.json,standard/apache_httpd/http_codes.json

  • Note on punchlet performance: on an Intel Core i7 2.5 GHz you should expect:
    • running the injection without doing anything: 730 Keps (thousand events per second)
    • running the injection with the input tuple creation only: 670 Keps
    • running the injection with the punchlets: 30 Keps

Message Section

This mandatory section contains the payload sent by the log injector.

"message" : {

    # the payloads are templates of what you inject. In there you 
    # can insert %{} variable fields that will be replaced by the 
    # corresponding element you define in the "fields" section 
    # described right next. For example here, %{src_ip} will be replaced 
    # by the "src_ip" field.
    # You can define a single payload. Should you define several 
    # ones like illustrated here, the injector will simply round-robin 
    # on each one.
    # Every time a message is generated, each %{} variable field is 
    # replaced by a new value.
    # You can thus finely control what your output data will look like.

    "payloads" : [
        "%{timestamp}: New session from IP %{src_ip} UUID %{uuid}.",
        "%{timestamp}: %{owner} visited URL %{url} %{nb_visits} times.",
        "%{timestamp}: %{owner} also uploaded %{outbytes}kb and downloaded %{inbytes}kb."
    ],

    # The fields section lets you define various kinds of generated 
    # values. In the following all the supported injector fields are 
    # described.

    "fields" : {

        "src_ip" : {
            # Generate IPV4 addresses. 
            "type" : "ipv4",

            # You use brackets to control what part of the address 
            # you want to make variable. Here all of them. 
            "format" : "[0-255].[0-255].[0-255].[0-255]"
        },
        "url" : {

            # Take the values from a list. Every time a value is 
            # generated you get the next element of your list.
            "type" : "list",

            # Here is your list. 
            "content" : [
                "GET /ref/index.html HTTP/1.1", 
                "GET /yet/another.html.css HTTP/1.1"
            ]
        },
        "owner" : {
            "type" : "list",
            "content" : ["frank", "bob", "alice", "ted", 
                            "dimi", "ced", "phil", "julien"],

            # This time we want to iterate differently. We want to 
            # send "frank" 3 times then "bob" 3 times and so on. 
            "update_every_loop": false,
            "update_every": 3
        },
        "uuid": {
            # Generate a valid unique string identifier
            "type": "session_id"
        },
        "nb_visits" : {
            "type" : "counter",
            "min" : 0,
            "max" : 12
        },
        "inbytes" : {
            "type" : "random",
            "min" : 1000,
            "max" : 30000
        },
        "outbytes" : {
            "type" : "gaussian",
            "mean": 200.0,
            "deviation" : 30.0,
            "mantissa_precision": 2,
            "always_positive": true
        },
        "timestamp" : {
            "type" : "timestamp",
            "format" :  "dd/MMM/yyyy:HH:mm Z",
            "start_time" : "2012.12.31",
            "start_time_format" : "yyyy.MM.dd",
            "tick_interval" : "1h"
        }
    }
}

In many cases you want to send JSON payloads. You can use embedded JSON to make it easier. An example explains it all:

"message" : {

    "payloads" : [
        {
            "time" : "%{timestamp}", 
            "aNumber" : %{number} 
        }
    ]
}

The resulting file is no longer valid JSON, because %{number} would need to be enclosed in quotes. The log injector deals with it, but this assumes that the field generates a numerical or boolean value.
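To see why the unquoted field only works for numerical or boolean values, here is a small Python sketch. It checks the embedded payload with a standard JSON parser before and after substitution. The substituted values and the helper name `is_valid_json` are made up for the illustration.

```python
import json

# The embedded payload from the example above, as a raw template string.
template = '{ "time" : "%{timestamp}", "aNumber" : %{number} }'

def is_valid_json(text):
    """Return True if a standard JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def fill(number_value):
    """Substitute illustrative values for the two template fields."""
    return (template
            .replace("%{timestamp}", "31/Dec/2012:01:00 +0100")
            .replace("%{number}", number_value))

print(is_valid_json(template))      # the raw template is not valid JSON
print(is_valid_json(fill("42")))    # a numerical value yields valid JSON
print(is_valid_json(fill("true")))  # so does a boolean value
print(is_valid_json(fill("abc")))   # an unquoted string does not
```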

Here are the several supported templated types:

  • ipv4 : to generate ipv4 addresses
  • list : to loop over a set of items
  • counter : an iterating numeric value
  • random : a random numeric value following uniform probability density.
  • gaussian : a random value following a gaussian probability density.
  • session_id : a short unique string identifier
  • uuid : a standard UUID
  • timestamp : a timestamp, for which you fully control the format, the start time, and the tick interval. You can also refer the start time of one timestamp to another.
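The way payload templates and fields combine can be sketched in a few lines of Python. This is only an illustration of the documented behaviour (round-robin over the payloads, each %{} variable replaced by a new generated value). It is not the injector's implementation, and it assumes that a field only advances when a payload references it. The `render` function and the sample values are hypothetical.

```python
import itertools
import re

FIELD = re.compile(r"%\{(\w+)\}")  # matches %{name}

def render(payloads, generators, count):
    """Yield `count` messages, cycling over payloads and filling %{name} fields."""
    templates = itertools.cycle(payloads)
    for _ in range(count):
        template = next(templates)
        # Each %{name} occurrence pulls one new value from its generator.
        yield FIELD.sub(lambda m: str(next(generators[m.group(1)])), template)

payloads = [
    "New session from IP %{src_ip}.",
    "%{owner} visited URL %{url}.",
]
# Stand-ins for "list" fields: each one loops over its content.
generators = {
    "src_ip": itertools.cycle(["10.0.0.1", "10.0.0.2"]),
    "owner": itertools.cycle(["frank", "bob"]),
    "url": itertools.cycle(["GET /ref/index.html HTTP/1.1"]),
}

messages = list(render(payloads, generators, 4))
for message in messages:
    print(message)
```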

Loop control

Whatever the type, you can control the value generation using the following optional parameters:

  • update_every_loop : boolean
    • controls how the field is updated: either at every loop, or once every update_every loops. Note that if set to false, the update_every parameter is mandatory.
    • default: true
  • update_every : int
    • the number of loop iterations before the generated value is changed.
    • default: 1
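As an illustration, the effect of update_every on a list field can be sketched as follows in Python. This is a simplified model written for this page, not the injector's code, and `list_field` is a hypothetical helper.

```python
import itertools

def list_field(content, update_every=1):
    """Yield list items, moving to the next item every `update_every` loops."""
    loop = 0
    while True:
        yield content[(loop // update_every) % len(content)]
        loop += 1

# With update_every = 3, each name is sent 3 times before moving on.
names = list(itertools.islice(list_field(["frank", "bob", "alice"], 3), 7))
print(names)
```

With the default update_every of 1, the same helper simply returns the next item at every loop.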


Using the session_id type you can generate short yet unique string ids. These ids are similar to YouTube or Elasticsearch ids. An example value is 'VVkgmncB4VD0JSomcivo'.

    "uniquecarrier_id": {
        "type": "session_id"
    }


Using the uuid type you can generate standard UUIDs.

    "uniquecarrier_id": {
        "type": "uuid"
    }


The list type takes the following parameter:

  • content : array
    • an array of values the injector will loop over.
    • example: [ 1, 2, 3 ], [ "hello", "world" ]


The counter type takes the following parameters:

  • min : the (inclusive) min value
  • max : the (inclusive) max value


The random type takes the following parameters:

  • min : the (exclusive) min value
  • max : the (exclusive) max value


The gaussian type takes the following parameters:

  • mean : int

    • the mean value of the distribution.
    • default: 0
  • deviation : int

    • the standard deviation. Note: this means that about 68% of the values fall within mean ± deviation.
    • default: 1
  • mantissa_precision : int

    • number of digits after the decimal point. If set to 0, the decimal point '.' itself is removed (integer).
    • default: 0
  • always_positive : boolean

    • only generate positive values. Note that the gaussian is then also cropped at 2*mean so that the mean value is preserved.
    • default: true
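The following Python sketch illustrates how these four parameters interact. It is an approximation under stated assumptions: cropping is modelled here by clamping the sample into [0, 2*mean], whereas the injector may crop differently (for example by drawing again), and `gaussian_value` is a hypothetical helper.

```python
import random

def gaussian_value(mean, deviation, mantissa_precision=0,
                   always_positive=True, rng=random):
    """Draw one gaussian sample and format it like a generated field value."""
    value = rng.gauss(mean, deviation)
    if always_positive:
        # Assumption: crop symmetrically around the mean by clamping.
        value = min(max(value, 0.0), 2.0 * mean)
    # With a precision of 0 the formatted value has no decimal point.
    return f"{value:.{mantissa_precision}f}"

rng = random.Random(0)  # seeded only to make the run repeatable
samples = [gaussian_value(200.0, 30.0, 2, True, rng) for _ in range(1000)]
print(samples[:3])
print(gaussian_value(200.0, 30.0, 0, True, rng))
```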


Using the timestamp field you can generate times in the format you need. Here is a simple explicit example:

            "departure_timestamp" : {
                "type" : "timestamp",
                "format" :  "dd/MMM/yyyy:HH:mm Z",
                "start_time" : "2012.12.31",
                "start_time_format" : "yyyy.MM.dd",
                "tick_interval" : "1h"
            }

That will produce:

departure_timestamp=31/Dec/2012:01:00 +0100
departure_timestamp=31/Dec/2012:02:00 +0100

Making a timestamp start from another one is handy to generate timestamps that represent a time interval. Say you want to add an "arrival" timestamp based on your departure timestamp plus a random duration (expressed in minutes in this example). Here is how you do it.

            "departure_timestamp" : {
                "type" : "timestamp",
                "format" :  "dd/MMM/yyyy:HH:mm Z",
                "start_time" : "2012.12.31",
                "start_time_format" : "yyyy.MM.dd",
                "tick_interval" : "1h"
            },
            "arrival_timestamp" : {
                "type" : "timestamp",
                "format" :  "dd/MMM/yyyy:HH:mm Z",
                "relative_start_time" : "departure_timestamp",
                "duration" : {
                    "type" : "random",
                    "unit" : "minute",
                    "min" : 120,
                    "max" : 360
                }
            }

You will get:

departure_timestamp=31/Dec/2012:06:00 +0100 arrival_timestamp=02/Jan/2013:14:23 +0100
departure_timestamp=31/Dec/2012:07:00 +0100 arrival_timestamp=31/Dec/2012:14:30 +0100
departure_timestamp=31/Dec/2012:08:00 +0100 arrival_timestamp=03/Jan/2013:14:38 +0100
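The relationship between the two timestamps can be modelled with a short Python sketch. It is a simplified illustration, not the injector's logic: it assumes a fixed +0100 offset, an English (default C) locale for the month name, a tick applied before each generated value, and inclusive duration bounds. The helper names `departures` and `arrival` are hypothetical, and the sketch is not meant to reproduce the sample values above.

```python
import random
from datetime import datetime, timedelta, timezone

# strftime counterpart of the "dd/MMM/yyyy:HH:mm Z" pattern
FORMAT = "%d/%b/%Y:%H:%M %z"

def departures(start, tick, count):
    """Yield `count` timestamps, advancing by `tick` before each one."""
    current = start
    for _ in range(count):
        current += tick
        yield current

def arrival(departure, rng, min_minutes=120, max_minutes=360):
    """Add a random duration in minutes to the departure timestamp."""
    return departure + timedelta(minutes=rng.randint(min_minutes, max_minutes))

tz = timezone(timedelta(hours=1))          # assumed +0100 offset
start = datetime(2012, 12, 31, tzinfo=tz)  # "start_time" : "2012.12.31"
rng = random.Random(0)

rows = []
for dep in departures(start, timedelta(hours=1), 3):
    rows.append((dep, arrival(dep, rng)))

for dep, arr in rows:
    print(f"departure_timestamp={dep.strftime(FORMAT)} "
          f"arrival_timestamp={arr.strftime(FORMAT)}")
```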

Return codes

The punchplatform-log-injector utility exits 0 on success, and >0 if an error occurs.


The following environment variables affect the execution of punchplatform-log-injector:

    • The PUNCHPLATFORM_CONF_DIR environment variable indicates the directory where tenant and channel configuration files are stored. A 'tenants' subdirectory is expected. Below it you will find a tenant then channel directory tree.


No known bugs.