HTTP Spout

The HTTP spout is similar to the TCP Syslog spout, but its base unit is the HTTP request body rather than a stream of bytes. It reads logs from incoming HTTP requests and injects them into the topology.

Here is a complete configuration example.

 {
     "type": "http_spout",
     "spout_settings": {
         "listen": {
             "host": "0.0.0.0",
             "port": "9999",
             "compression": true
         }
     },
     "storm_settings": {
         ...
     }
 }
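
As a rough illustration of what a client of this spout could look like, the sketch below builds an HTTP request whose body is a single JSON log document. It is only a sketch under assumptions that the configuration above does not confirm: the POST method, the "/" path, the JSON content type, and the helper name build_log_request are all hypothetical. It also assumes the spout runs without the socket-level "compression" option, since a plain HTTP client does not speak Netty ZLib.

```python
import json
import urllib.request


def build_log_request(host: str, port: int, log: dict) -> urllib.request.Request:
    """Build a request whose body is one JSON log document (hypothetical helper)."""
    body = json.dumps(log).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/",  # the "/" path is an assumption
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed method; the source does not state it
    )


# Hypothetical log document; the port mirrors the configuration above.
request = build_log_request("127.0.0.1", 9999, {"message": "hello"})

# Sending requires a running spout, so the call is left commented out:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```

The request is built separately from the send step so that the body and headers can be inspected without a live spout.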

Compression

The HTTP spout supports two compression modes. If you use the compression property, compression is performed at the socket level using Netty ZLib compression. If you use the http_compression parameter instead, compression is performed as part of the HTTP frame.

Note

Netty compression is the more efficient of the two, but it works only if the peer is a PunchPlatform HTTP spout. If you send your data to a standard HTTP server such as a Logstash daemon, use HTTP compression instead.
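
To make the difference concrete, here is a minimal sketch of compression carried inside the HTTP frame, from the point of view of a client sending a log. It assumes that HTTP-level compression means a gzip-compressed body announced with a standard Content-Encoding header; the source does not say which encoding the http_compression parameter expects, so treat the encoding, the URL, and the helper name build_gzip_request as assumptions.

```python
import gzip
import json
import urllib.request


def build_gzip_request(url: str, log: dict) -> urllib.request.Request:
    """Build a POST request with a gzip-compressed JSON body (hypothetical helper)."""
    raw = json.dumps(log).encode("utf-8")
    compressed = gzip.compress(raw)
    return urllib.request.Request(
        url=url,
        data=compressed,
        headers={
            "Content-Type": "application/json",
            # Assumption: the receiver understands standard gzip content encoding.
            "Content-Encoding": "gzip",
        },
        method="POST",
    )


gzip_request = build_gzip_request("http://127.0.0.1:9999/", {"message": "hello"})
```

With this approach only the body is compressed and the headers stay readable, which is why any standard HTTP peer can handle it. Socket-level compression wraps the whole byte stream, so both ends must agree on it.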

Streams and Fields

The HTTP spout emits a tuple with one or more fields into the topology. One field contains the request body, as read from the HTTP request. You can name that field the way you want. The other fields are optional and carry the HTTP request metadata (URI, user agent and content type), the local and remote socket IP addresses and port numbers, the local timestamp (set at reception) and a unique id. This is summarized by the next illustration:

../../../../_images/HttpSpoutContract.png

Field              Type    Description
log                String  the JSON document, received by the HTTP spout as the request body
http_uri           String  HTTP path URI (e.g. ‘/path/to/resource?q=a&bool=true’)
http_user_agent    String  HTTP user agent (e.g. ‘curl/7.54.0’)
http_content_type  String  HTTP content type (e.g. ‘application/x-www-form-urlencoded’)
local_uuid         String  a unique log id
local_host         String  the local host
local_port         int     the local port
remote_host        String  the remote host
remote_port        int     the remote port
local_timestamp    int     the local timestamp (set by the HTTP spout when it receives the log)
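
The field list above can be pictured as a simple mapping built for each received request. The sketch below is purely illustrative and is not the spout's actual implementation: the function name build_tuple, the use of a random UUID, and the epoch-millisecond timestamp unit are assumptions, as the source specifies only the field names and types.

```python
import time
import uuid


def build_tuple(body, uri, headers, local_addr, remote_addr):
    """Assemble the fields of one emitted tuple as a dict (illustrative only)."""
    local_host, local_port = local_addr
    remote_host, remote_port = remote_addr
    return {
        "log": body,  # the request body, unchanged
        "http_uri": uri,
        "http_user_agent": headers.get("User-Agent", ""),
        "http_content_type": headers.get("Content-Type", ""),
        "local_uuid": str(uuid.uuid4()),  # assumed id scheme
        "local_host": local_host,
        "local_port": local_port,
        "remote_host": remote_host,
        "remote_port": remote_port,
        "local_timestamp": int(time.time() * 1000),  # assumed unit: epoch ms
    }


example = build_tuple(
    body='{"message": "hello"}',
    uri="/path/to/resource?q=a&bool=true",
    headers={"User-Agent": "curl/7.54.0", "Content-Type": "application/json"},
    local_addr=("0.0.0.0", 9999),
    remote_addr=("192.0.2.10", 51234),
)
```

Only the log field is always present in a real tuple; the others appear when they are declared, so a downstream component should not assume all ten keys exist.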

Metrics

See Http Spout Metrics