Benchmark: How Misusing Streams Can Make Your Code 5 Times Slower

24th Nov 2015


How do Java 8 lambdas and streams perform compared to longstanding implementations?

Lambda expressions and streams received a heartwarming welcome in Java 8. These are by far the most exciting features making their way to Java in a long long time. The new language features allow us to adopt a more functional style in our code and we had lots of fun playing around with them. So much fun that it should be illegal. Then we got suspicious, and decided to put them to the test.
We’ve taken a simple task of finding a max value in an ArrayList and tested longstanding implementations versus new methods that became available with Java 8. Honestly, the results were quite surprising.

Imperative vs Functional Style Programming in Java 8

We like getting straight down to the point, so let’s take a look at the results. For this benchmark, we’ve created an ArrayList, populated it with 100,000 random integers and implemented 7 different ways to go through all the values to find the maximum. The implementations are divided into 2 groups: Functional style with new language features introduced in Java 8 and an imperative style with longstanding Java methods.
Here’s how long each method took:
[Figure: Java 8 Functional Benchmark]
** The biggest error recorded was 0.042 on parallelStream; the full results output is available at the bottom of this post.


  1. Whoops! Implementing a solution with ANY of the new methods Java 8 offers caused around a 5x performance hit. Sometimes using a simple loop with an iterator is better than getting lambdas and streams into the mix, even if it means writing a few more lines of code and skipping that sweet syntactic sugar.
  2. Using iterators or a for-each loop is the most effective way to go over an ArrayList: twice as fast as a traditional for loop with an int index.
  3. Among the Java 8 methods, parallel streams proved to be the most effective. But watch out, in some cases they could actually slow you down.
  4. Lambdas took their place in between the stream and the parallelStream implementations, which is kind of surprising since their implementation is based on the stream API.
  5. [EDIT] Things are not always as they seem: While we wanted to show how easy it is to introduce errors in lambdas and streams, we received lots of community feedback requesting to add more optimizations to the benchmark code and remove the boxing/unboxing of integers. The second set of results including the optimizations is available at the bottom of this post.


Wait, what exactly did we test here?

Let’s have a quick look at each of the methods, from the fastest to the slowest:

Imperative Style

iteratorMaxInteger() – Going over the list with an iterator:
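A minimal sketch of what this approach could look like. The class name IteratorMaxSketch and the list parameter are hypothetical, and the actual benchmark code on GitHub may differ.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class IteratorMaxSketch {
    // Walks the list with an explicit Iterator and keeps the largest value seen.
    // Returns Integer.MIN_VALUE for an empty list (an assumption of this sketch).
    static int iteratorMaxInteger(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (Iterator<Integer> it = integers.iterator(); it.hasNext(); ) {
            max = Integer.max(max, it.next());
        }
        return max;
    }
}
```

Calling IteratorMaxSketch.iteratorMaxInteger(Arrays.asList(3, 9, 4)) would return 9.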

forEachLoopMaxInteger() – Losing the Iterator and going over the list with a For-Each loop (not to be mistaken with Java 8 forEach):
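Roughly, the for-each version could be sketched as follows. ForEachLoopMaxSketch is a hypothetical class name and the method signature is our assumption. The compiler turns this loop into an iterator-based one for a List, which fits with the two variants landing close together.

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ForEachLoopMaxSketch {
    // Enhanced for loop over the list; each Integer is unboxed for the comparison.
    static int forEachLoopMaxInteger(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (Integer n : integers) {
            max = Integer.max(max, n);
        }
        return max;
    }
}
```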

forMaxInteger() – Going over the list with a simple for loop and an int index:
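A sketch of the indexed loop, under the same assumptions as before (ForMaxSketch is a hypothetical name, and the real benchmark method may read the list from a field rather than a parameter):

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ForMaxSketch {
    // Classic indexed loop: one size() call per iteration plus a get(i) lookup.
    static int forMaxInteger(List<Integer> integers) {
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < integers.size(); i++) {
            max = Integer.max(max, integers.get(i));
        }
        return max;
    }
}
```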

Functional Style

parallelStreamMaxInteger() – Going over the list using Java 8 stream, in parallel mode:
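One way this could look, as a sketch. We assume a reduce with Integer::max over boxed values here; the class name ParallelStreamMaxSketch and the empty-list fallback are hypothetical choices, not necessarily what the benchmark does.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical class name, used only for this illustration.
class ParallelStreamMaxSketch {
    // Reduces the boxed Integers in parallel; Integer::max is associative,
    // so the result does not depend on how the work is split.
    static int parallelStreamMaxInteger(List<Integer> integers) {
        Optional<Integer> max = integers.parallelStream().reduce(Integer::max);
        return max.orElse(Integer.MIN_VALUE);
    }
}
```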

lambdaMaxInteger() – Using a lambda expression with a stream. Sweet one-liner:
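The one-liner might look something like this sketch. LambdaMaxSketch is a hypothetical class name, and using Integer.MIN_VALUE as the identity is our assumption.

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class LambdaMaxSketch {
    // Sequential stream reduced with an explicit lambda; values stay boxed
    // as Integer between steps, which is part of the cost.
    static int lambdaMaxInteger(List<Integer> integers) {
        return integers.stream().reduce(Integer.MIN_VALUE, (a, b) -> Integer.max(a, b));
    }
}
```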

forEachLambdaMaxInteger() – This one is a bit messy for our use case. Probably the most annoying thing with the new Java 8 forEach feature is that the lambda it takes can only capture local variables that are final (or effectively final), so we created a little workaround with a final wrapper class that accesses the max value we’re updating:
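A sketch of the wrapper workaround. The names ForEachLambdaMaxSketch, Wrapper and inner are hypothetical, and the original benchmark code may have structured the helper differently.

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ForEachLambdaMaxSketch {
    // Hypothetical mutable holder: the reference is final, its field is not.
    static final class Wrapper {
        int inner = Integer.MIN_VALUE;
    }

    static int forEachLambdaMaxInteger(List<Integer> integers) {
        final Wrapper wrapper = new Wrapper();
        // The lambda captures the final reference and mutates the field behind it.
        integers.forEach(i -> wrapper.inner = Integer.max(wrapper.inner, i));
        return wrapper.inner;
    }
}
```

The lambda never reassigns wrapper itself, which is why the compiler accepts it. Mutating shared state like this is only safe with the sequential forEach shown here.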

By the way, if we’re already talking about forEach, check out this StackOverflow answer we ran into providing some interesting insights into some of its shortcomings.

streamMaxInteger() – Going over the list using Java 8 stream:
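A sketch of the sequential stream version, assuming a reduce with a method reference. StreamMaxSketch is a hypothetical class name, and falling back to Integer.MIN_VALUE on an empty list is our choice for the illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical class name, used only for this illustration.
class StreamMaxSketch {
    // Sequential reduce over boxed Integers; the Optional is empty only
    // when the list itself is empty.
    static int streamMaxInteger(List<Integer> integers) {
        Optional<Integer> max = integers.stream().reduce(Integer::max);
        return max.orElse(Integer.MIN_VALUE);
    }
}
```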


Optimized Benchmark

Following the feedback for this post, we’ve created another version of the benchmark. All the differences from the original code can be viewed right here. Here are the results:
[Figure: Java 8 Functional Benchmark, optimized version]

TL;DR: Summary of the changes

  1. The list is no longer volatile.
  2. New method forMax2 removes field access.
  3. The redundant helper function in forEachLambda is fixed. Now the lambda is also assigning a value. Less readable, but faster.
  4. Auto-boxing eliminated. If you turn on auto-boxing warnings for the project in Eclipse, the old code had 15 warnings.
  5. Fixed streams code by using mapToInt before reduce.
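To make two of these changes concrete, here is a rough sketch of what "removing field access" and "mapToInt before reduce" could look like. The class OptimizedMaxSketch, its constructor and the method name streamMaxInt are hypothetical; only the name forMax2 comes from the list above, and its body here is our guess at the idea rather than the actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only for this illustration.
class OptimizedMaxSketch {
    // Plain (non-volatile) field standing in for the benchmark's list.
    private final List<Integer> integers;

    OptimizedMaxSketch(List<Integer> integers) {
        this.integers = new ArrayList<>(integers);
    }

    // Reads the field once into a local, so the loop works on the local copy
    // of the reference and a cached size instead of going through the field.
    int forMax2() {
        List<Integer> local = integers;
        int size = local.size();
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < size; i++) {
            max = Integer.max(max, local.get(i));
        }
        return max;
    }

    // mapToInt unboxes each element once, then the reduce runs on primitive
    // ints (IntStream) instead of boxing every intermediate result.
    int streamMaxInt() {
        return integers.stream()
                .mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, Integer::max);
    }
}
```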

Thanks to Patrick Reinhart, Richard Warburton, Yan Bonnel, Sergey Kuksenko, Jeff Maxwell, Henrik Gustafsson and everyone who commented here and on Twitter for your contributions!
Another optimization that came from our community was that parallelStream max is faster than using reduce, as James Pittendreigh pointed out in a tweet.
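For comparison, the two variants could be sketched side by side like this. ParallelMaxVsReduceSketch and both method names are hypothetical, and we are assuming the primitive IntStream form for both; the sketch shows only the shape of the calls, not their relative speed.

```java
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ParallelMaxVsReduceSketch {
    // Uses the built-in max() terminal operation, which returns an OptionalInt.
    static int parallelStreamMax(List<Integer> integers) {
        return integers.parallelStream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(Integer.MIN_VALUE);
    }

    // Same result through an explicit reduce with an identity value.
    static int parallelStreamReduce(List<Integer> integers) {
        return integers.parallelStream()
                .mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, Integer::max);
    }
}
```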

The groundwork

To run this benchmark we used JMH, the Java Microbenchmark Harness. If you’d like to learn more about how to use it in your own projects, check out this post where we go through some of its main features with a hands-on example.
The benchmark configuration included 2 forks of the JVM, 5 warmup iterations and 5 measurement iterations. The tests were run on a c3.xlarge Amazon EC2 instance (4 vCPUs, 7.5 GiB of memory, 2 x 40 GB SSD storage), using Java 8u66 with JMH 1.11.2. The full source code is available on GitHub, and you can view the raw results output right here.
With that said, a little disclaimer: Benchmarks tend to be pretty treacherous and it’s super hard to get them right. While we tried to run this one in the most accurate way, it’s always recommended to take the results with a grain of salt.

Final Thoughts

The first thing to do when you get on Java 8 is to try lambda expressions and streams in action. But beware: It feels really nice and sweet so you might get addicted! We’ve seen that sticking to a more traditional Java programming style with iterators and for-each loops significantly outperforms the new implementations made available by Java 8. Of course it’s not always the case, but in this pretty common example, the Java 8 versions were around 5 times slower. That can get pretty scary if it affects a core part of your system or creates a new bottleneck.

Alex is the Director of Product Marketing at OverOps. As an engineer-turned-marketer, he is passionate about transforming complex topics into simple narratives and using his experience to help software engineering navigate their way through the crowded DevOps landscape.
