For advanced users, it can be useful to precisely select the outputs of Storm topologies, so as to keep the most meaningful information without saturating disk storage with trace logs. This section shows how topology logging levels can be tuned.
On a standalone setup, to easily understand what is happening, launch a channel (Apache in our example):
$ punchplatform-channel.sh --start mytenant/apache_httpd
Now, by running
ps aux | grep apache_httpd
you can see the JVM options of the worker. One of them lets you retrieve the current configuration file used by your topology. By default, this file is named “worker.xml”. It is used by the Storm Log4j module to configure log outputs: by updating this file, you can increase or decrease verbosity. On a standalone setup, this file is located at
On a cluster setup, the worker.xml location depends on your setups_root variable. In our case, setups_root is /data/opt, so the resulting location is
On a single-node setup, only one such file exists, on the single host. On a multi-node installation, however, the “worker.xml” is deployed per Storm worker, so it exists on each node of the cluster. If you decide to update one of them, you should update it on every node in order to preserve consistency.
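To keep the files consistent across nodes, the copy can be scripted. The sketch below is a dry run (it only prints the commands it would execute); the node names and the target path are hypothetical examples, to be adapted to your cluster:

```shell
# Dry-run sketch: print the commands that would propagate an edited
# worker.xml to every Storm node. Hostnames and the destination path
# are examples only; adjust them to your own setup, then remove the
# leading 'echo' to actually perform the copies.
for node in storm-node-1 storm-node-2 storm-node-3; do
  echo scp worker.xml "$node":/data/opt/storm/log4j2/worker.xml
done
```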
In production, you should NOT modify it! Because this file is used by every worker, you probably do not want to alter your working configuration on the fly. For debug tasks, see this troubleshooting guide.
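To double-check which Log4j configuration file a running worker actually uses, you can extract it from the worker's JVM options (as shown by ps aux above). This sketch assumes the worker is launched with the standard -Dlog4j.configurationFile option; the command line and path below are illustrative, not your actual setup:

```shell
# Simulated worker command line for illustration; on a live system you
# would pipe the real output of: ps aux | grep apache_httpd
# (the path below is a hypothetical example)
jvm_opts='java -Dlog4j.configurationFile=/data/opt/storm/log4j2/worker.xml -cp storm.jar'

# Extract the Log4j configuration file option from the JVM options
echo "$jvm_opts" | grep -oE 'log4j\.configurationFile=[^ ]+'
```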
What does the worker.xml file look like?
<configuration monitorInterval="60" shutdownHook="disable">
  <properties>
    ...
  </properties>
  <appenders>
    ...
  </appenders>
  <loggers>
    <root level="info">
      <appender-ref ref="A1"/>
      <appender-ref ref="syslog"/>
    </root>
    <Logger name="org.apache.storm.metric.LoggingMetricsConsumer" level="info" additivity="false">
      <appender-ref ref="METRICS"/>
    </Logger>
    <Logger name="STDERR" level="INFO">
      <appender-ref ref="STDERR"/>
      <appender-ref ref="syslog"/>
    </Logger>
    <Logger name="STDOUT" level="INFO">
      <appender-ref ref="STDOUT"/>
      <appender-ref ref="syslog"/>
    </Logger>
  </loggers>
</configuration>
At the beginning, the configuration tag has its monitorInterval attribute set to 60, which means this configuration file is reloaded every 60 seconds if a modification has been made.
Then, for each logger, the level can be set to one of the following values (by increasing verbosity): FATAL, ERROR, WARN, INFO, DEBUG, TRACE.
The new configuration will be applied to ALL existing topologies. It is NOT possible to modify log levels for only one topology.
Note that, from “FATAL” to “DEBUG”, the verbosity remains acceptable even in a production setup. Beware of the “TRACE” level in production.
For example, to output absolutely everything, replace the loggers tag this way (strongly discouraged, for example purposes only):
<loggers>
  <root level="TRACE"/>
</loggers>
If you want to be more precise and change only some loggers' levels, see the following documentation extract:
<loggers>
  <!-- We rely on a generic kafka consumer, itself used by the kafka bolt.
       Set it to DEBUG for tracing IO bulk operations. This will not flood
       your logs because the consumer reads many kafka messages at once.
       Set it to TRACE for more details. -->
  <logger name="com.thales.services.cloudomc.punchplatform.kafka.consumer" level="WARN"/>

  <!-- Bolts or spouts. -->
  <logger name="com.thales.services.cloudomc.punchplatform.storm.bolt" level="WARN"/>
  <logger name="com.thales.services.cloudomc.punchplatform.storm.spout" level="WARN"/>

  <!-- These ones are handy to keep an eye on your socket traffic if you use
       syslog, udp, tcp or lumberjack. The MonitoringHandler one dumps the
       complete traffic. -->
  <logger name="com.thales.services.cloudomc.punchplatform.commons.netty" level="WARN"/>
  <logger name="com.thales.services.cloudomc.punchplatform.commons.netty.impl.MonitoringHandler" level="WARN"/>

  <!-- Use these if you write or work with Punch scripts. Watch out: DEBUG can be verbose. -->
  <logger name="com.thales.services.cloudomc.punchplatform.punch" level="WARN"/>
  <logger name="com.thales.services.cloudomc.punchplatform.punch.compile" level="WARN"/>

  <!-- These ones are useful if you struggle running a punchlet. Useful also
       for topologies since, of course, your punchlet will be executed. -->
  <logger name="com.thales.services.cloudomc.punchplatform.punch.runtime" level="WARN"/>
  <logger name="com.thales.services.cloudomc.punchplatform.punch.runtime.operator" level="WARN"/>

  <!-- Use these if you have objects storage problems. -->
  <logger name="com.thales.services.cloudomc.punchplatform.objectspool" level="WARN"/>
  <logger name="com.thales.services.cloudomc.punchplatform.ceph.client" level="WARN"/>

  <!-- Zookeeper and Storm are verbose. -->
  <logger name="org.apache.zookeeper" level="WARN"/>
  <logger name="org.apache.storm" level="WARN"/>

  <!-- These are useful if you struggle having the Elasticsearch bolt connect
       to the target elasticsearch cluster. -->
  <logger name="org.elasticsearch.cluster" level="WARN"/>
  <logger name="org.elasticsearch.discovery" level="WARN"/>

  <!-- Metrics sent to Elasticsearch or to the slf4j reporter are prefixed
       with "punchplatform", or something else if you changed the platform_id
       in your punchplatform.properties. -->
  <logger name="punchplatform" level="WARN"/>
</loggers>
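Putting this together, a common debugging pattern is to keep the root logger quiet and raise a single logger of interest. This is a sketch only: the chosen logger is just one example taken from the extract above, and the appender names are those of the worker.xml shown earlier.

```xml
<loggers>
  <!-- Keep the overall output at the usual level. -->
  <root level="info">
    <appender-ref ref="A1"/>
    <appender-ref ref="syslog"/>
  </root>
  <!-- Raise only the punchlet runtime (example logger from the extract above). -->
  <logger name="com.thales.services.cloudomc.punchplatform.punch.runtime" level="DEBUG"/>
</loggers>
```

Thanks to the monitorInterval="60" attribute, such a change is picked up within a minute, without restarting the topologies.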