Logging

Samza uses SLF4J for all of its logging. By default, Samza only depends on slf4j-api, so you must add an SLF4J runtime dependency to your Samza packages for whichever underlying logging platform you wish to use.
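As an illustration of choosing a different backend, the dependency below is a minimal sketch that binds SLF4J to java.util.logging through the slf4j-jdk14 artifact. Both the choice of binding and the version (matched to the log4j example on this page) are assumptions made for illustration, not a recommendation from Samza. Only one SLF4J binding should be on the classpath at a time.

```xml
<!-- Illustrative only: binds SLF4J to java.util.logging instead of log4j. -->
<!-- The version is an assumption; use the one that matches your slf4j-api. -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-jdk14</artifactId>
  <scope>runtime</scope>
  <version>1.6.2</version>
</dependency>
```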

Log4j

The hello-samza project shows how to use log4j with Samza. To turn on log4j logging, make sure slf4j-log4j12 is on your SamzaContainer’s classpath. In Maven, you can do this by adding the following dependency to your Samza package project:

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <scope>runtime</scope>
  <version>1.6.2</version>
</dependency>

If you’re not using Maven, just make sure that slf4j-log4j12 ends up in your Samza package’s lib directory.

Log4j configuration

Samza’s run-class.sh script automatically sets the following Java system property if log4j.xml exists in your Samza package’s lib directory:

-Dlog4j.configuration=file:$base_dir/lib/log4j.xml

The run-class.sh script will also set the following Java system properties:

-Dsamza.log.dir=$SAMZA_LOG_DIR -Dsamza.container.name=$SAMZA_CONTAINER_NAME

These settings are very useful if you’re using a file-based appender. For example, you can use a daily rolling appender by configuring log4j.xml like this:

<appender name="RollingAppender" class="org.apache.log4j.DailyRollingFileAppender">
   <param name="File" value="${samza.log.dir}/${samza.container.name}.log" />
   <param name="DatePattern" value="'.'yyyy-MM-dd" />
   <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n" />
   </layout>
</appender>

Setting up a file-based appender is recommended as a better alternative to using standard out. Standard out log files (see below) don’t roll, and can get quite large if used for logging.
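For context, the appender above has to sit inside a complete log4j.xml and be attached to a logger before it receives any events. The following is a minimal sketch of such a file, using the standard log4j 1.x XML layout. The root level of info and the decision to attach the appender to the root logger are assumptions, so adjust them to your needs.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- Daily rolling file appender; path is built from the system properties set by run-class.sh. -->
  <appender name="RollingAppender" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="${samza.log.dir}/${samza.container.name}.log" />
    <param name="DatePattern" value="'.'yyyy-MM-dd" />
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n" />
    </layout>
  </appender>
  <!-- Assumption: send everything at INFO and above to the rolling file. -->
  <root>
    <priority value="info" />
    <appender-ref ref="RollingAppender" />
  </root>
</log4j:configuration>
```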

Changing log levels

Sometimes it’s desirable to change the log4j log level from INFO to DEBUG at runtime, so that a developer can enable more logging for a Samza container that’s misbehaving. Samza provides a log4j appender class called JmxAppender, which lets you modify log levels dynamically at runtime. The JmxAppender class is located in the samza-log4j package, and can be turned on by first adding a runtime dependency on the samza-log4j package:

<dependency>
  <groupId>org.apache.samza</groupId>
  <artifactId>samza-log4j</artifactId>
  <scope>runtime</scope>
  <version>${samza.version}</version>
</dependency>

And then updating your log4j.xml to include the appender:

<appender name="jmx" class="org.apache.samza.logging.log4j.JmxAppender" />
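Declaring an appender in log4j 1.x does not by itself attach it to a logger. The sketch below assumes the JmxAppender is wired in like any other appender, by referencing it from the root logger next to the RollingAppender shown earlier. The root level and the use of the root logger are assumptions, not requirements stated here.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- The RollingAppender definition from the earlier example goes here. -->
  <appender name="jmx" class="org.apache.samza.logging.log4j.JmxAppender" />
  <root>
    <priority value="info" />
    <appender-ref ref="RollingAppender" />
    <!-- Assumption: reference the JMX appender so log4j instantiates it. -->
    <appender-ref ref="jmx" />
  </root>
</log4j:configuration>
```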

Log Directory

Samza looks for the SAMZA_LOG_DIR environment variable when it executes. If this variable is defined, all logs are written to that directory. If the environment variable is empty or not defined, Samza uses $base_dir, which is the directory one level up from Samza’s run-class.sh script. The resulting directory can also be referenced inside log4j.xml files through the samza.log.dir system property (see above).

Garbage Collection Logging

Samza automatically sets the following garbage collection logging options, which write the GC log to $SAMZA_LOG_DIR/gc.log:

-XX:+PrintGCDateStamps -Xloggc:$SAMZA_LOG_DIR/gc.log

Rotation

In older versions of Java, it is impossible to have GC logs roll over based on time or size without the use of a secondary tool. This means that your GC logs will never be deleted until a Samza job ceases to run. As of Java 6 Update 34, and Java 7 Update 2, new GC command line switches have been added to support this functionality. If GC log file rotation is supported by the JVM, Samza will also set:

-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10241024

YARN

When a Samza job executes on a YARN grid and YARN is securely configured, the $SAMZA_LOG_DIR environment variable points to a directory that only the user executing the Samza job can read and write.

STDOUT

Samza’s ApplicationMaster pipes all STDOUT and STDERR output to logs/stdout and logs/stderr, respectively. These files are never rotated.
