# HOWTO activate logging
## Why do that

To debug a PML job, or to better understand its behavior.
## Prerequisites

You need a punchplatform standalone installed with Spark. The easiest way to work with Spark (and PML) is to launch the job in the foreground using the `punchplatform-analytics.sh` command. For example, if you have a job defined in a `job.pml` file, use the following command:
```sh
punchplatform-analytics.sh --job job.pml --spark-master local[*] --deploy-mode client
```
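If you prefer not to edit Spark's global configuration file, note that in client mode Spark lets the driver load an alternate log4j 1.x configuration through a JVM system property. A sketch using plain `spark-submit` (the properties-file path and application jar are placeholders, and whether `punchplatform-analytics.sh` forwards such options is an assumption to verify):

```sh
# Sketch: point the Spark driver at a custom log4j 1.x configuration.
# /tmp/my-log4j.properties and my-job.jar are illustrative names only.
spark-submit \
  --master "local[*]" \
  --deploy-mode client \
  --driver-java-options "-Dlog4j.configuration=file:/tmp/my-log4j.properties" \
  my-job.jar
```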
## What to do

### Configure the Spark log4j.properties
Spark uses log4j. Its configuration file is located at:

```sh
punchplatform-standalone-*/external/spark-x.y.z-bin-hadoop2.7/conf/log4j.properties
```
There, activate the loggers you need. For example, should you need debug logging for the punch stage:
```properties
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %t %c: %m%n
log4j.logger.org.thales.punch.ml=INFO
log4j.logger.org.thales.punch.libraries.punchlang=DEBUG
```
!!! warning
    By default, the delivered log4j.properties is configured with the ERROR level only, so as to limit the standard output to the most relevant Spark messages.
## Important loggers
- `org.apache.spark`: the Spark loggers
- `org.thales.punch`: the various punchplatform loggers
- `org.thales.punch.ml`: legacy classes; they will progressively disappear
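As a quick sketch of how these loggers combine, you could make the punchplatform classes verbose while keeping Spark itself quiet (the file path below is illustrative; point it at your real `conf/log4j.properties` instead, and adjust levels to your needs):

```shell
# Sketch: write punch logger settings into an example log4j.properties file.
# /tmp/log4j.properties is a placeholder path, not the real Spark conf file.
LOG4J=/tmp/log4j.properties
{
  echo 'log4j.logger.org.apache.spark=WARN'    # keep the Spark loggers quiet
  echo 'log4j.logger.org.thales.punch=DEBUG'   # verbose punchplatform logging
} > "$LOG4J"
# Show the resulting logger settings:
grep 'log4j.logger' "$LOG4J"
```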