The Top 100 Java Libraries in 2018 – Based on 277,975 Source Files

 ● 01st Nov 2018



Time flies when you’re having fun, and the past year was pretty crazy. It included SpaceX successfully launching Falcon Heavy, their partially reusable heavy-lift launch vehicle (Yup, this happened in 2018), continued with Apple becoming the world’s first public company to achieve a market capitalization of $1 trillion, and we even created embryo-like structures from stem cells alone, without using egg or sperm cells.
A lot went on during the past year, and still one of the things that excite us the most is our annual Java libraries data crunch. Big shout out to Guy Castel who helped us pull and crunch the numbers so we could present you with the top 100 Java libraries – 2018 edition. Let’s check it out.

There’s a New Java Queen!


The top 20 Java libraries for 2018

Ladies and gentlemen, a dramatic turnover! After 3 years of JUnit being the undisputed leader of our top Java libraries chart, this year we have a new library sitting on the Java throne: fasterXML/Jackson, also known as JSON for Java. In second place we find Apache Hadoop, the open-source software for scalable and distributed computing, while our old pal JUnit dropped down to third place, taking home the bronze.
Closing our top 5 libraries are the extended JUnit abstract Runner class and Spring Framework, showing an impressive rise since last year.
Among the popular names in our top 20 libraries, we can find Eclipse’s Jetty, Apache Shiro (a Java security framework), Netty (asynchronous event-driven network application framework), and Google’s Guice library, a dependency injection framework for Java 6 and above.
A few other libraries that caught our attention include ch.qos.logback, a successor to the log4j project, and org.openjdk.jmh, a Java harness for building, running, and analyzing nano/micro/macro benchmarks in Java and other JVM languages.
We can also find Selenium, which allows you to add automation to web applications and browsers. Another interesting library comes from Alibaba, the Chinese conglomerate specializing in e-commerce, retail, Internet, AI and technology. Unfortunately, the documentation was in Chinese, so we can’t be sure as to what the exact purpose of this library is ¯\_(ツ)_/¯.

Last Year’s Winners: Where Are They Now?

To see how much has changed during the last year, we took a look at the 2017 top 20 Java libraries and searched for them in our current list. We already saw the decline in using JUnit, but other names were on top and now… they’re not.
The first two libraries we looked at are Mockito, an open source testing framework, and slf4j, the logging facade for Java. Both were in the top 5 libraries of 2017, and both dropped down the chart. You’ll find Mockito at #23, and slf4j is further down at #25. Yikes.
Last year we had our money on Hamcrest (#6), a framework that helps write tests within JUnit and jMock. However, it turned out that developers are still searching for the ideal testing environment, and Hamcrest dropped down to the 37th position.
One trend that did carry over from 2017 to 2018 is the wide use of Apache libraries. This year we have no fewer than 36 different Apache libraries, such as:

#14 org.apache.lucene
#16 org.apache.commons.lang
#20 org.apache.maven
#38 org.apache.ignite
#43 org.apache.thrift
#58 org.apache.commons.compress
#61 org.apache.kafka
#65 org.apache.spark
#68 org.apache.calcite
And others. Wowza.

Looking Bottom-Up

The main thing we saw in this year’s results is that everything can change, and libraries that are at the bottom might end up on top the following year. That’s why we took a look at some of the less popular libraries on our 2018 chart, and made a mental note to follow up on them during 2019:

  • #85 org.objectweb.asm – A simple API for decomposing, modifying, and recomposing binary Java classes
  • #86 – Google Guava Primitive Types
  • #87 com.datastax.driver – DataStax Java Driver for Apache Cassandra
  • #88 org.json – The data interchange format
  • #89 org.apache.commons.math3 – The Apache Commons Mathematics Library
  • #90 – A latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries.

View the complete list of top 100 Java libraries, here.


And Now For Something Completely Different

Each year we pull the top Java libraries from GitHub, and start crunching the numbers. A big chunk of this crunching is done manually, since there’s a need to differentiate between the various folders under one owner. That’s why some owners, such as org.springframework, org.apache and so on, appear in our list with a number of separate libraries.
This year we decided to take a broader look at the top Java libraries to see who’s the top “owner”. We combined all of the org.eclipse.XX, org.codehaus.YY, io.netty.ZZ and others under one roof, and came up with the top 20 Java names in GitHub:

Top 20 Java libraries (combined) for 2018


There are some differences between the detailed and summed-up lists. Apache took first place among all libraries, with an impressive 4,180 repositories, while Spring came in second place with “only” 1,282 repositories. Google came in third place, followed by JUnit and Eclipse. At this point we can see a significant decline in the number of repositories used, as we drop from 4,000+ to around 600.
Another interesting takeaway is that this list includes some libraries that we didn’t see in the original count, such as:

  • Hibernate, which had a lot of scattered libraries and didn’t make the top 100 cut
  • BouncyCastle, a collection of APIs used in cryptography

What does it mean? Well, we can play with these numbers all day long, and each time get a different and interesting result. In our original count, we wanted to give you the most detailed view across top Java libraries, to help you learn more about the market, or even help you decide which library you should use.
You can view both the detailed and combined lists, here.
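The "one roof" step described above can be pictured with a small Java sketch. This is not the code behind the combined chart. It assumes an owner is simply the first two segments of a package name, and the class and method names (OwnerRollup, ownerOf, combine) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not the method used for the chart.
final class OwnerRollup {

    // Assumption: the "owner" is the first two dot-separated segments,
    // so org.eclipse.jetty and org.eclipse.jdt both roll up to org.eclipse.
    static String ownerOf(String library) {
        String[] parts = library.split("\\.");
        return parts.length <= 2 ? library : parts[0] + "." + parts[1];
    }

    // Sums the per-library counts under each owner, sorted by owner name.
    static Map<String, Integer> combine(Map<String, Integer> perLibrary) {
        Map<String, Integer> perOwner = new TreeMap<>();
        for (Map.Entry<String, Integer> e : perLibrary.entrySet()) {
            perOwner.merge(ownerOf(e.getKey()), e.getValue(), Integer::sum);
        }
        return perOwner;
    }

    public static void main(String[] args) {
        // Made-up counts, only to show the shape of the input and output.
        Map<String, Integer> perLibrary = new LinkedHashMap<>();
        perLibrary.put("org.eclipse.jetty", 3);
        perLibrary.put("org.eclipse.jdt", 2);
        perLibrary.put("io.netty.channel", 4);
        System.out.println(combine(perLibrary));
    }
}
```

One caveat with this sketch: summing per-library counts counts a repository twice if it uses two libraries from the same owner, so a real repository count per owner would need to de-duplicate at the repository level first.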

How Did We Do It?

Our method of pulling the data and crunching the numbers is similar to what we did last year. With the kind help of Guy Castel from the OverOps R&D team, we once again turned to our friend Google BigQuery, along with GitHub’s API. We pulled the top 1,000 repositories, and out of those we extracted the Java libraries these repositories use.
Out of the 277,975 Java source files we pulled from GitHub, we filtered out Android, Arduino, duplicated and deprecated repos. At this point we were left with 28,021 Java source files. After slicing, dicing and analyzing, we got our final top 100 list. As you probably know, code speaks louder than words, so let’s see how Guy did it.
First, we wanted to create the top repositories table:

Now that we had the names of the top repositories, we pulled all of their content:

After we had the source files for each project, we wanted to pull all of their unique import statements. In the following query, we extracted the package name, and made sure it was counted just once per project:
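The original query is not reproduced here. As a stand-in, the following Java sketch shows the same idea outside BigQuery: read the import statements of each source file, reduce each one to a package prefix, and count every prefix at most once per project. The class ImportCounter, its method names, and the rule that a library is the leading lowercase package segments (capped at maxDepth) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the query used for the article.
final class ImportCounter {

    // Matches "import a.b.C;", "import a.b.*;" and "import static a.b.C.d;".
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.]+?)(?:\\.\\*)?\\s*;", Pattern.MULTILINE);

    // Assumption: package segments start with a lowercase letter, class names do not.
    // Keeps at most maxDepth leading package segments.
    static String libraryOf(String importedName, int maxDepth) {
        StringBuilder sb = new StringBuilder();
        int taken = 0;
        for (String part : importedName.split("\\.")) {
            if (taken == maxDepth || part.isEmpty()
                    || !Character.isLowerCase(part.charAt(0))) {
                break;
            }
            if (taken > 0) sb.append('.');
            sb.append(part);
            taken++;
        }
        return sb.toString();
    }

    // Unique libraries imported by a single source file.
    static Set<String> librariesInSource(String source, int maxDepth) {
        Set<String> libs = new HashSet<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            String lib = libraryOf(m.group(1), maxDepth);
            if (!lib.isEmpty()) libs.add(lib);
        }
        return libs;
    }

    // projects maps a project name to the contents of its source files.
    // Each library is counted once per project, however many files import it.
    static Map<String, Integer> countProjectsPerLibrary(
            Map<String, List<String>> projects, int maxDepth) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> sources : projects.values()) {
            Set<String> perProject = new HashSet<>();
            for (String src : sources) {
                perProject.addAll(librariesInSource(src, maxDepth));
            }
            for (String lib : perProject) counts.merge(lib, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny made-up project, only to show how the helper is called.
        Map<String, List<String>> projects = new HashMap<>();
        projects.put("demo", Arrays.asList(
                "import org.junit.Test;\nclass A {}",
                "import org.junit.Assert;\nclass B {}"));
        System.out.println(countProjectsPerLibrary(projects, 3));
    }
}
```

A fixed depth is a simplification. As noted earlier, telling libraries apart under one owner took manual work, which a single prefix rule like this cannot replace.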

The final step was filtering the results again, making sure that there were no Android, Arduino, deprecated or standard Java libraries that might have slipped through our query-cracks:
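Again, this is not the original filter. A minimal Java sketch of such a pass might drop every library whose name falls under an excluded root package. The exclusion list below (java, javax, android, com.android) is an assumption for illustration. It does not cover Arduino or deprecated libraries, which would need their own entries, and the LibraryFilter class is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the filter used for the article.
final class LibraryFilter {

    // Assumed exclusion list: standard Java and Android root packages.
    private static final List<String> EXCLUDED_ROOTS =
            Arrays.asList("java", "javax", "android", "com.android");

    // True when the library is an excluded root or sits below one.
    // Comparing against root + "." keeps "javafx.scene" from matching "java".
    static boolean isExcluded(String library) {
        for (String root : EXCLUDED_ROOTS) {
            if (library.equals(root) || library.startsWith(root + ".")) {
                return true;
            }
        }
        return false;
    }

    // Returns the libraries that survive the filter, in their original order.
    static List<String> keepThirdParty(List<String> libraries) {
        List<String> kept = new ArrayList<>();
        for (String lib : libraries) {
            if (!isExcluded(lib)) kept.add(lib);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keepThirdParty(
                Arrays.asList("java.util", "android.os", "org.junit")));
    }
}
```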

Which gave us the top Java libraries of 2018. Whoo. 🎉

Final Thoughts

Java went through some changes over the last few months. It started with the move towards a six-month release cycle, and recently we heard that only OpenJDK builds will be freely accessible, while Java SE 8 public updates will require a commercial license for business, commercial or production use.
The changes in Java affect developers, which we can see through the shift in popularity of GitHub’s top Java libraries. The recent news about charging for Java SE updates might lead to an increase in the use of OpenJDK-related libraries, or maybe even shift the numbers towards enterprise-related libraries. The one thing we do know is that we can’t wait for next year to see how, and if, anything has changed.
Found other interesting libraries within our spreadsheet? We’d love to hear about them in the comments below.


Henn is a marketing manager at OverOps covering topics related to Java, Scala and everything in between. She is a lover of gadgets, apps, technology and tea.
