Is Standard Java Logging Dead? Log4j vs Log4j2 vs Logback vs java.util.logging

29th Nov 2016

The Java log levels showdown: SEVERE FATAL ERROR OMG PANIC

Capitalized log levels induce high levels of stress. What if, instead of ERROR, we’d just use “oops”? On a more serious note, we recently ran a huge data crunch over GitHub’s top Java projects and the logging statements they use, revealing the log level breakdown of the average Java project.
In this post, we’ll explore the resulting dataset from another angle and put the focus on the use of standard java.util.logging levels versus more popular frameworks like Log4j, Log4j 2, and Logback.
Step right in.

[This blog post is included as chapter 2 of our free Guide to Java Logging in Production. Download the full eBook here.]

Meet the players

Logging utilities can be roughly divided into two categories: logging facades and logging engines.
As far as logging facades go, you pretty much have two choices: slf4j and Apache’s commons-logging. In practice, 4 out of 5 Java projects choose to go with slf4j, based on data from the top Java libraries on GitHub in 2016. The motivation for using a logging facade is straightforward: it is an abstraction on top of your logging engine of choice, allowing you to replace the engine without changing the actual code and logging statements.
As for the logging engine, the most popular picks are Logback, which was designed as a successor to Log4j; Log4j itself; and Log4j 2, the Apache Software Foundation’s newer version of it. Trailing behind is Java’s default logging engine, java.util.logging, aka JUL.
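To make the facade idea concrete, here is a minimal sketch that uses only the JDK. The `SimpleLog` interface, its two implementations, and `OrderService` are hypothetical names made up for illustration; this is not slf4j’s API, just the same pattern in miniature. The application code depends on the interface alone, so the engine behind it can be swapped without touching the logging statements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

final class FacadeSketch {

    // Hypothetical mini-facade; real facades such as slf4j offer far more.
    interface SimpleLog {
        void info(String message);
        void error(String message);
    }

    // One possible engine binding: forwards every call to java.util.logging.
    static final class JulSimpleLog implements SimpleLog {
        private final Logger delegate;

        JulSimpleLog(String name) {
            this.delegate = Logger.getLogger(name);
        }

        @Override
        public void info(String message) {
            delegate.log(Level.INFO, message);
        }

        @Override
        public void error(String message) {
            delegate.log(Level.SEVERE, message); // SEVERE is JUL's name for ERROR
        }
    }

    // A second binding that only records messages in memory.
    static final class InMemorySimpleLog implements SimpleLog {
        final List<String> lines = new ArrayList<>();

        @Override
        public void info(String message) {
            lines.add("INFO " + message);
        }

        @Override
        public void error(String message) {
            lines.add("ERROR " + message);
        }
    }

    // Application code: it knows the facade, never the engine.
    static final class OrderService {
        private final SimpleLog log;

        OrderService(SimpleLog log) {
            this.log = log;
        }

        void placeOrder(String id) {
            log.info("Placing order " + id);
        }
    }

    public static void main(String[] args) {
        // Same service, two different engines, no change to placeOrder().
        new OrderService(new JulSimpleLog("orders")).placeOrder("A-1");

        InMemorySimpleLog memory = new InMemorySimpleLog();
        new OrderService(memory).placeOrder("A-2");
        System.out.println(memory.lines);
    }
}
```

With a real facade, the swap usually happens by changing a dependency on the classpath rather than a constructor argument, but the decoupling is the same.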

Pointing fingers and calling names

On the “superficial” side of things, each of the logging frameworks has slightly different names for their logging levels.
[Figure: Log Levels]
In the rare case where slf4j is used with java.util.logging, the following mapping takes place:
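As a rough sketch, the mapping can be written out in plain Java. The class and method names here (`Slf4jToJulLevels`, `toJul`) are hypothetical, and the pairs reflect our understanding of how slf4j’s JDK binding translates levels, so double-check against the slf4j version you actually run.

```java
import java.util.logging.Level;

final class Slf4jToJulLevels {

    // slf4j level name -> java.util.logging level (assumed to match slf4j's JDK binding).
    static Level toJul(String slf4jLevel) {
        switch (slf4jLevel) {
            case "TRACE": return Level.FINEST;
            case "DEBUG": return Level.FINE;
            case "INFO":  return Level.INFO;
            case "WARN":  return Level.WARNING;
            case "ERROR": return Level.SEVERE;
            default:
                throw new IllegalArgumentException("Unknown slf4j level: " + slf4jLevel);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}) {
            System.out.println(name + " -> " + toJul(name));
        }
    }
}
```

Note that JUL’s CONFIG and FINER levels have no slf4j counterpart in this mapping.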
Another thing to notice here is that Logback and java.util.logging have no FATAL equivalent. Behind those level names are simple integer values that help control the logging level in a running application. Each library also contains values for OFF and ALL, which set the logger level to transmit nothing or everything, respectively. Setting a logger level to WARN, for instance, would only log WARN messages and above; it’s practically the default setting for production environments.
By the way, one of the cool things about the tool that we’re building is that you can get log messages lower than WARN in production, even if you’ve set the logger level to WARN. Check out this video for a quick (25 sec) demonstration.
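Here is a small sketch of those integer values and the threshold behavior, using java.util.logging since it ships with the JDK. The class name, the `isEnabled` helper, and the logger name `demo.threshold` are hypothetical; the other frameworks follow the same idea with their own numbers.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LevelThresholdDemo {

    // Sets the logger's threshold, then asks whether a message at 'candidate' would pass.
    static boolean isEnabled(Level threshold, Level candidate) {
        Logger logger = Logger.getLogger("demo.threshold"); // hypothetical logger name
        logger.setLevel(threshold);
        return logger.isLoggable(candidate);
    }

    public static void main(String[] args) {
        Level[] levels = {
            Level.OFF, Level.SEVERE, Level.WARNING, Level.INFO,
            Level.CONFIG, Level.FINE, Level.FINER, Level.FINEST, Level.ALL
        };
        // Each level name is backed by a plain int; OFF and ALL sit at the extremes.
        for (Level level : levels) {
            System.out.println(level.getName() + " = " + level.intValue());
        }

        // With the threshold at WARNING, only WARNING and SEVERE get through.
        System.out.println("SEVERE enabled:  " + isEnabled(Level.WARNING, Level.SEVERE));
        System.out.println("WARNING enabled: " + isEnabled(Level.WARNING, Level.WARNING));
        System.out.println("INFO enabled:    " + isEnabled(Level.WARNING, Level.INFO));
    }
}
```

In JUL, a message passes when its level value is at least the logger’s threshold value and the threshold is not OFF.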

How does the level naming breakdown look in practice?

For the data crunch, we focused on the top starred Java projects with at least 100 logging statements in either of the methods. Examining the data set of projects, here’s what we found:
[Figure: Logging Levels by Type]
Only 4.4% of projects exclusively used the java.util.logging naming scheme.
The average non-JUL logging project looked like this (examining 1,313 projects):
[Figure: The Average Java Log Level Distribution]
To look at the average java.util.logging project, we filtered the data down to include only projects that had at least 100 statements from levels that don’t overlap with the non-JUL naming scheme (WARNING and INFO). This left us with a smaller dataset, so it might not be big enough to draw definite conclusions from:
[Figure: JUL Logging Average]
With that said, it looks like in both situations, roughly ⅔ of logging statements are disabled in production, since only WARN and above are activated in that case.
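If you want to run the same check on your own project, the arithmetic is simple: add up the statements whose level ranks below the production threshold and divide by the total. The sketch below assumes the Log4j-style level names; the class name, the `shareBelow` helper, and above all the counts in `main` are made up for illustration and are not the numbers from our dataset.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DisabledShare {

    // Levels ordered from least to most severe (Log4j-style names).
    static final List<String> ORDER =
            Arrays.asList("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL");

    // Fraction of statements whose level ranks below the threshold.
    static double shareBelow(Map<String, Integer> countsByLevel, String threshold) {
        int cutoff = ORDER.indexOf(threshold);
        if (cutoff < 0) {
            throw new IllegalArgumentException("Unknown threshold: " + threshold);
        }
        long total = 0;
        long disabled = 0;
        for (Map.Entry<String, Integer> entry : countsByLevel.entrySet()) {
            int rank = ORDER.indexOf(entry.getKey());
            if (rank < 0) {
                throw new IllegalArgumentException("Unknown level: " + entry.getKey());
            }
            total += entry.getValue();
            if (rank < cutoff) {
                disabled += entry.getValue();
            }
        }
        return total == 0 ? 0.0 : (double) disabled / total;
    }

    public static void main(String[] args) {
        // Made-up counts for a hypothetical project, NOT data from the study.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("TRACE", 5);
        counts.put("DEBUG", 30);
        counts.put("INFO", 25);
        counts.put("WARN", 20);
        counts.put("ERROR", 20);

        System.out.println("Share disabled at WARN: " + shareBelow(counts, "WARN"));
    }
}
```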
Fun fact: As an extra datapoint, we also looked at ALL / OFF levels. Turns out only 8.6% of the projects examined used them both.

How did we reach the data?

The starting point for this research is the GitHub archive, and its datasets on Google BigQuery. We wanted to focus on qualified Java projects, excluding Android, sample projects, and simple testers. A natural choice was to look at the most starred projects, taking in the database of the top 400,000 repositories.
We ended up with 15,797 repositories with Java source files, 4% of the initial dataset. But it didn’t stop there. Looking at the number of logging statements, we decided to only focus on projects with at least 100 different statements. The dataset is available right here.
We believe this sample is fairly representative for our purposes. For the full walkthrough and the steps we took to reach the data, including the exact SQL queries, check out the last part in this post.
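To give a feel for what counting logging statements can look like, here is a rough sketch in Java. It is not the BigQuery SQL we actually ran; it is a hypothetical regex heuristic (the class name, the `countByLevel` helper, and the pattern are all our own assumptions) that tallies method calls named after log levels in a piece of source text.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogStatementCounter {

    // Matches calls such as log.warn( or LOGGER.severe( ; a heuristic, not a parser,
    // so any unrelated method with one of these names is counted as well.
    private static final Pattern LEVEL_CALL = Pattern.compile(
            "\\.(trace|debug|info|warn|warning|error|fatal|severe|config|fine|finer|finest)\\s*\\(");

    // Returns how many calls were found per level-named method.
    static Map<String, Integer> countByLevel(String source) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = LEVEL_CALL.matcher(source);
        while (matcher.find()) {
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String source = "log.info(a); log.warn(b); LOGGER.warning(c); log.info (d); logger.finest(e);";
        System.out.println(countByLevel(source));
    }
}
```

Keeping `warn` and `warning` as separate keys is what lets you tell the Log4j-style naming scheme apart from the JUL one.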

Final Thoughts

This post shows that java.util.logging is, well, practically dead. Most serious projects choose to go with third-party logging frameworks. Did you find anything else that we might have missed in the dataset? Do you have other interesting questions that can be answered through this or similar data?
Feel free to suggest your ideas in the comment section below.

Alex is the Director of Product Marketing at OverOps. As an engineer-turned-marketer, he is passionate about transforming complex topics into simple narratives and using his experience to help software engineers navigate their way through the crowded DevOps landscape.
