HOWTO extract logs from elasticsearch with logger
Why do that¶
To extract a small number of logs, for instance the result of an investigation, the punchplatform kibana plugin is better suited. The following method is meant to extract a large amount of logs over a continuous period of time with a simple query.
If you have access to PML, you should use it instead.
What to do¶
Select your data¶
To extract data from elasticsearch, we first build a topology that prints the result of the query.
{
  "tenant": "mytenant",
  "channel": "extractor",
  "name": "extractor",
  "dag": [
    {
      "type": "extraction_input",
      "settings": {
        "index": "*metricbeat*/doc",
        "query": "?q=beat.version:6*"
      },
      "storm_settings": {
        "executors": 1,
        "component": "extractor_spout",
        "publish": [
          {
            "stream": "default",
            "fields": [
              "doc"
            ]
          }
        ]
      }
    }
  ],
  "bolts": [
    {
      "type": "punchlet_node",
      "settings": {
        "punchlet_code": "{print(root:[default][doc].toJson());}"
      },
      "storm_settings": {
        "executors": 1,
        "component": "punchlet_node",
        "subscribe": [
          {
            "component": "extractor_spout",
            "stream": "default"
          }
        ]
      }
    }
  ]
}
Run the topology with the command:
punchlinectl <your_topology>.json
If you don't see any data on the terminal, recheck the extraction_input
settings to ensure that at least one document is found.
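As a quick sanity check independent of the topology, you can ask Elasticsearch how many documents the query matches using the `_count` API, which accepts the same Lucene `q=` syntax as the extraction_input node. The snippet below only prints the curl command to run; the `localhost:9200` endpoint is an assumption, adjust it to your platform:

```shell
# Assumption: Elasticsearch is reachable on localhost:9200.
# _count returns the number of documents matching the Lucene query.
ES="localhost:9200"
echo "curl \"http://$ES/*metricbeat*/_count?q=beat.version:6*\""
# prints: curl "http://localhost:9200/*metricbeat*/_count?q=beat.version:6*"
```

If the returned count is 0, fix the index pattern or the query before going further.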
Then, update the punchlet_code
field with the following settings:
"punchlet_code" : "{ logger().warn(root:[default][doc].toJson()); }"
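After this change, the punchlet node of the topology above reads as follows (only the punchlet_code value differs from the first version):

```json
{
  "type": "punchlet_node",
  "settings": {
    "punchlet_code": "{ logger().warn(root:[default][doc].toJson()); }"
  },
  "storm_settings": {
    "executors": 1,
    "component": "punchlet_node",
    "subscribe": [
      {
        "component": "extractor_spout",
        "stream": "default"
      }
    ]
  }
}
```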
Update the logger configuration¶
We recommend starting the extraction in the foreground.
Backup the current log4j2-topology.xml
file located in the
operator library folder. Depending on the platform installation, it
can be found at:
- standalone:
<install_dir>/external/punch-operator-*/bin/log4j2-topology.xml
- deployed:
/data/opt/punch-operator-*/bin/log4j2-topology.xml
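The backup step itself is a plain copy. The sketch below plays it out in a scratch directory with a stand-in file; on a real platform, cd into the operator library folder listed above instead:

```shell
# Scratch directory and stand-in file for the demonstration only.
mkdir -p /tmp/log4j2-demo && cd /tmp/log4j2-demo
echo '<Configuration/>' > log4j2-topology.xml
# Keep a pristine copy so the original can be restored after the extraction.
cp log4j2-topology.xml log4j2-topology.xml.orig
```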
Now, update the previous log4j2-topology.xml
file with the following settings:
- fileName: path where the extracted logs are stored
- filePattern: name and pattern of the archives that contain the rolled-over extracted logs
Here is an example of the original file, updated only with the new parameters. The ... must be replaced with the original file content; we removed it here only for clarity.
<?xml version="1.0" encoding="UTF-8"?>
<Configuration monitorInterval="10" shutdownHook="disable">
  <properties>
    <property name="patternPunchlet">%msg%n</property>
    ...
  </properties>
  <Appenders>
    <RollingFile name="PUNCHLETLOGGER"
                 fileName="${sys:punchplatform.log.dir}/extraction/${sys:logfile.name}.json"
                 filePattern="${sys:punchplatform.log.dir}/extraction/${sys:logfile.name}.json.%i.gz">
      <PatternLayout>
        <pattern>${patternPunchlet}</pattern>
      </PatternLayout>
      <Policies>
        <SizeBasedTriggeringPolicy size="100 MB"/>
      </Policies>
      <DefaultRolloverStrategy max="1000000"/>
    </RollingFile>
    ...
  </Appenders>
  <Loggers>
    <logger name="org.thales.punch.libraries.punchlang.api.Punchlet" level="warn" additivity="false">
      <appender-ref ref="PUNCHLETLOGGER"/>
    </logger>
    ...
  </Loggers>
</Configuration>
Finally, run the extraction¶
punchlinectl <topology_name>.json
Where are the output files?¶
For example, if you use the default parameters set in the log4j2-topology.xml
above, you first have to find the punchplatform.log.dir directory. To do so, run this command:
punchplatform-env.sh | grep PUNCHPLATFORM_LOG_DIR
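Assuming the variable is printed in KEY=VALUE form (an assumption, check the actual output on your platform), its value can be captured directly. The real command is simulated here with printf so the pipeline can be shown end to end:

```shell
# printf stands in for punchplatform-env.sh, assumed to print KEY=VALUE lines.
printf 'PUNCHPLATFORM_LOG_DIR=/data/logs/punchplatform\n' \
  | grep PUNCHPLATFORM_LOG_DIR \
  | cut -d '=' -f 2-
# prints: /data/logs/punchplatform
```

On a real platform: `punchplatform-env.sh | grep PUNCHPLATFORM_LOG_DIR | cut -d '=' -f 2-`.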
From this folder, you will find your extraction files under the extraction
directory. In each file, you finally get one log event per line.
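As a sketch of what the result looks like, the snippet below creates a synthetic extraction file and counts its events; real files sit under the extraction directory, and rolled-over archives are gzipped, so `gunzip -c` streams them the same way:

```shell
# Synthetic stand-in for an extraction file: one JSON log event per line.
mkdir -p /tmp/extraction-demo
printf '{"id":1}\n{"id":2}\n{"id":3}\n' > /tmp/extraction-demo/extractor.json
# Count extracted events, one per line.
wc -l < /tmp/extraction-demo/extractor.json
```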