HOWTO activate logging

Why do that

To debug, or to better understand, the behavior of a PML job.

Prerequisites

You need a punchplatform standalone installed with Spark. The easiest way to work with Spark (and PML) is to launch the job in the foreground using the punchplatform-analytics.sh command. For example, if you have a job defined in a [job.pml] file, use the following command:

$ punchplatform-analytics.sh --job job.pml --spark-master local[*] --deploy-mode client

What to do

Configure the Spark log4j.properties

Spark uses log4j. Its configuration file is located at:

$ punchplatform-standalone-*/external/spark-x.y.z-bin-hadoop2.7/conf/log4j.properties

There, activate the loggers you need. For example, should you need debug logging for the punch stage:

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %t %c: %m%n
log4j.logger.org.thales.punch.ml=INFO
log4j.logger.org.thales.punch.libraries.punchlang=DEBUG

!!! warning By default the delivered log4j.properties is configured with ERROR level only, so as to limit the standard output to the most relevant Spark messages.

Important loggers

  • org.apache.spark : the Spark loggers
  • org.thales.punch : the various punchplatform loggers
  • org.thales.punch.ml : the legacy classes; they will progressively disappear.
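
As an illustration, here is a minimal log4j.properties fragment enabling each of these loggers. The levels chosen (INFO vs DEBUG) are arbitrary examples, not recommendations; tune them to your own needs:

```properties
# Spark internals: INFO is usually verbose enough for a first look
log4j.logger.org.apache.spark=INFO
# The various punchplatform loggers: DEBUG shown here as an example
log4j.logger.org.thales.punch=DEBUG
# Legacy classes, kept at INFO in this example
log4j.logger.org.thales.punch.ml=INFO
```

These lines go in the same conf/log4j.properties file shown above; relaunch the job in client mode afterwards to see the additional output on stderr.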