Troubleshooting Archiving topology logs

What to do

Understanding what is going on inside your archiving topology can be difficult and time-consuming. A quick and efficient way to see what is happening is to increase the verbosity of the related loggers.

How to do it

First, find the path of your topology's logger configuration file. It should show up in the output of a simple ps aux | grep <topology name>.

Here is an output example:

ps aux | grep archiving_topology
punch     4997  109  2.6 4009120 429748 pts/2  Sl   16:18   0:18 /home/punch/.jenv/versions/1.8/bin/java
[...]
-Dlogfile.name=worker.log
-Dstorm.log.dir=/home/punch/punchplatform/standalone/punch-standalone-6.0.0/external/apache-storm-1.2.2/logs
-Dlog4j.configurationFile=/home/punch/punchplatform/standalone/punch-standalone-6.0.0/external/apache-storm-1.2.2/log4j2/worker.xml
[...]
org.apache.storm.daemon.worker mytenant_apache_httpd_archiving_topology-1-1571062699 b003941a-0463-4f87-968f-6df83a2758f6 6720 4cc0febe-8972-47d8-badc-4f655c1f8399

Here, the configuration file we want to update is /home/punch/punchplatform/standalone/punch-standalone-6.0.0/external/apache-storm-1.2.2/log4j2/worker.xml.
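Reading the path out of a long ps line by eye is error-prone, so the lookup can be scripted. The following is a minimal sketch, not part of the punch tooling: the helper name extract_log4j_config is hypothetical, and it assumes the configuration path contains no spaces.

```shell
# Hypothetical helper: read a process command line on stdin and print
# the value of its -Dlog4j.configurationFile option (first match only).
# Assumption: the path contains no spaces.
extract_log4j_config() {
  # Put each command-line argument on its own line, then keep only
  # the value of the option we are interested in.
  tr ' ' '\n' | sed -n 's/^-Dlog4j\.configurationFile=//p' | head -n 1
}

# Usage (the [a] bracket keeps grep from matching its own process):
#   ps aux | grep '[a]rchiving_topology' | extract_log4j_config
```

If several workers match the pattern, only the first path is printed. Workers of the same Storm installation normally share the same worker.xml, but check the full ps output if in doubt.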

Add these two logger lines to the <Loggers> XML section:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration monitorInterval="10" shutdownHook="disable">
  <properties>
    ...
  </properties>
  <Appenders>
    ...
  </Appenders>
  <Loggers>
    ...
    <logger name="org.thales.punch.objects" level="TRACE"/>
    <logger name="org.thales.punch.libraries.storm.bolt" level="TRACE"/>
    ...
  </Loggers>
</Configuration>

As you can see in the attributes of the Configuration tag, the monitorInterval parameter is set to "10". This means that Log4j2 checks this configuration file for changes about every 10 seconds and reloads it when it has been modified, so you do NOT need to restart the topology!
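Before relying on the automatic reload, it can be worth checking that monitorInterval is actually set in the file you are editing. The snippet below is a small sketch: the function name monitor_interval is hypothetical, and it assumes the Configuration tag and its monitorInterval attribute are written on a single line, as in the example above.

```shell
# Hypothetical helper: print the monitorInterval value (in seconds)
# declared on the <Configuration> tag of a Log4j2 configuration file.
# Prints nothing when the attribute is absent; in that case the file is
# not re-read automatically and the worker must be restarted.
# Assumption: the tag and the attribute sit on the same line.
monitor_interval() {
  sed -n 's/.*<[Cc]onfiguration[^>]*monitorInterval="\([0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Usage:
#   monitor_interval /path/to/log4j2/worker.xml
```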

Now, you should see [TRACE] or [DEBUG] level logs like the examples below:

16:17:12 [DEBUG] message="object storage destination" destination="file:///tmp/archive-logs/storage"
16:17:13 [TRACE] message="processing filebolt batch" batch_id="1571062633316" batch_size=100
16:17:13 [DEBUG] message="indexing batch metadata into Elasticsearch" pool_name="apache-httpd-archiving" object_id="httpd/0/2019.10.14/httpd-0-1571062633316" doc="{"batch_id":"httpd/0/2019.10.14/httpd-0-1571062633316","nb_of_tuples":100,"earliest_tuple_datetime":"2019-10-14T15:47:11.652+02:00","latest_tuple_datetime":"2019-10-14T15:47:13.38+02:00","content_size_bytes":137069,"uncompressed_content_size_bytes":137069,"compression_rate":0.0,"cluster_addresses":["file:///tmp/archive-logs/storage"],"pool_name":"apache-httpd-archiving","topic_name":"httpd","partition_id":0,"batch_num":1571062633316,"tuples_encoding_format":"TEXT","compression_format":"NONE","fields":["_ppf_id","_ppf_timestamp","log"],"enciphering_header":null,"enciphering":false,"fields_separator":"__|__","content_has_header":true,"bloom_filter":"","channel":"apache_httpd","tenant":"mytenant","archiving_ts":1571062633997,"@timestamp":"2019-10-14T14:17:13.997Z"}"
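
Once TRACE is enabled, the worker log becomes very chatty, so it helps to keep only the newly enabled levels while following it. This is a minimal sketch: the function name verbose_lines is hypothetical, and the location of worker.log is left as a placeholder because it depends on your Storm installation (look under the directory given by -Dstorm.log.dir in the ps output).

```shell
# Hypothetical helper: keep only the log lines tagged [TRACE] or [DEBUG],
# matching the level format shown in the examples above.
verbose_lines() {
  grep -E '\[(TRACE|DEBUG)\]'
}

# Usage (replace the placeholder with the path of your worker.log):
#   tail -F <path to worker.log> | verbose_lines
```

When you are done troubleshooting, remove the two logger lines from worker.xml (or raise their level) to bring the log volume back to normal.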