Log Receiver Punchline¶
Refer to the Central Log Management reference architecture overview.
Key Design highlights¶
Transport / interface / HA¶
Logs are received from the remote collector sites over the lumberjack protocol, in a load-balanced way, through the built-in balancing mechanism of the sender site (here, a Punch lumberjack output node acting as emitter in a forwarder punchline of the LTR; see the LTR reference architecture).
To achieve high availability, at least two "receiver" punchlines are started. Each has a fixed IP address, obtained by running on a fixed node. A fixed location for a punchline is ensured by using a tagged Shiva runner node (with a tag equal to the hostname). In legacy configurations, a one-node Storm 'cluster' on each server of the cluster could be used instead (a copy of the punchline then runs inside each single-node 'Storm cluster').
With these multiple receivers, we also gain input scalability: the lumberjack sender (lumberjack output node) load-balances over all available lumberjack receivers, which therefore combine their throughput capacity (as long as the target Kafka cluster is also scaled as needed).
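To picture the idea (this is an illustration only, not the actual Punch implementation; endpoint names are hypothetical), the sender-side balancing can be sketched as a simple round-robin over the configured receiver endpoints:

```python
from itertools import cycle

# Hypothetical receiver endpoints: one per fixed-IP receiver punchline.
RECEIVERS = ["ltr-rx1:2001", "ltr-rx2:2001"]

def make_balancer(endpoints):
    """Return a function yielding the next endpoint, round-robin style."""
    ring = cycle(endpoints)
    return lambda: next(ring)

next_endpoint = make_balancer(RECEIVERS)
# Successive batches alternate between receivers, so their
# throughput capacities add up.
batches = [next_endpoint() for _ in range(4)]
# -> ['ltr-rx1:2001', 'ltr-rx2:2001', 'ltr-rx1:2001', 'ltr-rx2:2001']
```

The real sender also handles acknowledgement and failover, but the capacity argument is the same: each batch goes to exactly one of the available receivers.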
The receiver punchlines are expected to use very little CPU (therefore doing as little processing as possible) before writing to Kafka. This allows fast consumption of the logs queued on remote sites and, under nominal platform conditions, reaches as soon as possible a 'safe' disk-persisted state of the event data, including replication (except on single-node deployments). This matters most for non-replicated collector sites, or collector sites with little persistence capacity.
Multiple front topics and logs dispatching by device types¶
To allow specialized log processor punchlines, it is often easier to separate the different kinds of logs into different "input queues" (implemented as different topics in the 'front' Kafka cluster).
So one of the processing roles of this receiver punchline is to act as a dispatcher of event types towards the different processing queues. To comply with the 'write-fast' requirement, it is not advised to check each individual incoming document against many 'patterns' to identify the log type; the type identification must rely on easily available metadata (e.g. the remote IP address of the last log forwarder to the collection site). In this example, the hypothesis is that the collection sites have multiple input listening ports, used to differentiate the incoming log source types. This way, although all log documents are forwarded to the central site through a single lumberjack port, we can dispatch them to different processing queues based on this 'initial input port' information.
Of course, other means of efficiently identifying the log type can exist, such as parsing just enough of the syslog envelope to identify the source device IP, then using a configuration table to determine the device type.
It is also important that the dispatching mechanism has an 'unknown type' stream (and an associated processing queue), to handle unexpected device types that may be added to the event sources, or wrongly formatted input logs that lead to an unrecognizable device type.
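As an illustration of this port-based dispatching (sketched in Python rather than in Punch syntax; the topic names and the `init_port` metadata field are assumptions, not the actual platform fields), the decision can look like this:

```python
# Hypothetical mapping from the collector's listening port (carried as
# metadata with each forwarded log) to a 'front' Kafka topic.
PORT_TO_TOPIC = {
    1514: "front-firewall",
    2514: "front-proxy",
    3514: "front-windows",
}

# Catch-all queue for unexpected device types or malformed input.
UNKNOWN_TOPIC = "front-unknown"

def dispatch_topic(event: dict) -> str:
    """Pick the target topic from the 'initial input port' metadata only,
    without inspecting the log payload (this keeps the receiver fast)."""
    return PORT_TO_TOPIC.get(event.get("init_port"), UNKNOWN_TOPIC)

dispatch_topic({"init_port": 1514, "log": "..."})  # -> "front-firewall"
dispatch_topic({"init_port": 9999, "log": "..."})  # -> "front-unknown"
```

Note that the unknown-port case falls through to a real queue rather than being dropped, which is the point made above.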
Because there is a "punch node" in the punchline (for the dispatching), there is a risk of exceptions/errors in the punchlet code.
This means that a specific processing queue has to be identified to capture these unexpected documents, so that they are not lost (and so that someone can later identify the problem, which may require changes in the source device configuration or in the device-type discovery mechanism).
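One common way to make sure a failing dispatch never drops a document (again sketched in Python, not in Punch language; the error topic name is illustrative) is to wrap the dispatching step and route any failing document, unmodified apart from an error annotation, to a dedicated error queue:

```python
# Hypothetical queue for documents that made the punchlet fail.
ERROR_TOPIC = "front-errors"

def safe_dispatch(event: dict, dispatch_fn):
    """Return (topic, event); on any dispatching error, keep the raw
    document and send it to the error queue instead of losing it."""
    try:
        return dispatch_fn(event), event
    except Exception as err:
        annotated = dict(event, _dispatch_error=str(err))
        return ERROR_TOPIC, annotated

def broken_dispatch(event):
    # Raises KeyError when the expected metadata is missing.
    return event["device_type"].lower()

topic, doc = safe_dispatch({"log": "raw line"}, broken_dispatch)
# topic == "front-errors"; doc keeps the original log plus the error message
```

The original payload stays intact in the error queue, so an operator can later replay or inspect it once the device configuration or discovery mechanism is fixed.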
Receiver channel HA configuration example¶
As explained above, the receiving application must be run in multiple instances, each located on one of the servers bearing the input addresses of the lumberjack flow. When leveraging Shiva placement tags, this leads to the following kind of channel_structure pattern: