Reactive Streams and the Weird Case of Back Pressure

6th Dec 2016


Everything you need to know about Reactive Streams and Lightbend

There are a lot of initiatives that aim to improve workflows for developers, but only a few actually make it to become a living and breathing concept. One of them is Reactive Streams.
Its goal is to provide a standard for asynchronous stream processing with non-blocking back pressure. It allows many independent implementations, each preserving the benefits and characteristics of async programming across the whole processing graph of a stream application.
Let’s see what that means and how a huge project like this even gets started.

The Building Blocks of Reactive Streams

This concept was first ignited by Lightbend in 2013, soon joined by other companies in pursuit of creating a new interoperable standard that can work with any language. After a bit over a year of incubation as Reactive Streams, the interfaces/contracts have been included verbatim in the JDK and will arrive in JDK 9 – currently slated for July 2017. In the meantime, several third-party libraries already implement the interfaces, so you can use them with their exact semantics before they land in the JDK.
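To give a flavor of what those contracts look like in practice, here is a small sketch using the JDK 9 `java.util.concurrent.Flow` API, where the Reactive Streams interfaces live as nested types, together with `SubmissionPublisher`, the JDK's built-in `Flow.Publisher` implementation. The item values and class name are illustrative; the API calls are the real JDK 9+ ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class FlowDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                private Flow.Subscription subscription;
                @Override public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1);                 // demand one element at a time
                }
                @Override public void onNext(String item) {
                    received.add(item);
                    subscription.request(1);      // back pressure: ask for the next one
                }
                @Override public void onError(Throwable t) { done.countDown(); }
                @Override public void onComplete() { done.countDown(); }
            });
            publisher.submit("a");
            publisher.submit("b");
            publisher.submit("c");
        }                                          // close() triggers onComplete
        done.await();
        System.out.println("received=" + received);
    }
}
```

Note that the subscriber never receives more elements than it has requested via `Subscription.request(n)` – that demand signal is the heart of the standard.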
But what’s a Reactive Stream and how do you actually use it? To answer this question, we had a chat with Konrad Malawski, senior developer on Lightbend’s core Akka team. Konrad has contributed to various projects in Akka, including Streams, and implemented the technology compatibility kit (TCK) for Reactive Streams.

From Reactive Streams to Akka Streams

The idea behind Reactive Streams started when Lightbend wanted to set up an industry-wide collaboration to solve back-pressured asynchronous stream processing. Back pressure describes the build-up of data that occurs when tasks arrive faster than the system can process them – resulting in a growing buffer of unhandled data.
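The mechanism that prevents this build-up is demand signalling: the consumer tells the producer how many elements it is ready for, and the producer never emits more than that. A minimal, single-threaded sketch of the idea – the interfaces below are simplified stand-ins for the Reactive Streams contract (no error or cancellation handling), not the real `org.reactivestreams` types:

```java
import java.util.ArrayList;
import java.util.List;

public class BackPressureSketch {

    // Simplified stand-ins for the Reactive Streams contract.
    interface Subscription { void request(long n); }
    interface Subscriber<T> { void onSubscribe(Subscription s); void onNext(T t); void onComplete(); }
    interface Publisher<T> { void subscribe(Subscriber<? super T> s); }

    // Emits integers [0, count), but only as many as were requested:
    // the producer can never outrun the consumer's stated demand.
    static Publisher<Integer> range(int count) {
        return subscriber -> subscriber.onSubscribe(new Subscription() {
            int next = 0;
            boolean done = false;
            @Override public void request(long n) {
                for (long i = 0; i < n && next < count; i++) {
                    subscriber.onNext(next++);            // emit only on demand
                }
                if (next == count && !done) { done = true; subscriber.onComplete(); }
            }
        });
    }

    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        range(5).subscribe(new Subscriber<Integer>() {
            Subscription subscription;
            @Override public void onSubscribe(Subscription s) {
                subscription = s;
                s.request(2);                             // initial demand: two elements
            }
            @Override public void onNext(Integer item) {
                received.add(item);
                if (received.size() % 2 == 0) {
                    subscription.request(2);              // processed a batch, ask for more
                }
            }
            @Override public void onComplete() {
                System.out.println("received=" + received);
            }
        });
    }
}
```

Real implementations additionally handle errors and cancellation, and bound the recursion that occurs when `request` is invoked from within `onNext`; this sketch keeps only the demand flow.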
The company wanted to introduce modern reactive systems next to legacy systems, which often could not cope with the high throughput the greenfield projects would require.
For example, if an application that produces data streams were rewritten using reactive technology, it could become much faster than the applications consuming its data. That sounds great, but the resulting high volume of data might destabilize the consumers, which in turn could affect the data or the end users.
On a smaller scale, the same thing happens within any asynchronous system, which is what most, if not all, reactive applications are.
Enter Reactive Streams. It lets developers maintain a well-throttled (back-pressured) flow of data throughout such systems. It gained popularity, and since becoming a standard, various independent libraries speak the same semantics and can seamlessly connect to each other.
In case you’re wondering what are the driving forces behind Akka Streams, you can read our interview with Dr. Roland Kuhn, former Akka Tech Lead at Lightbend.

Specification to Implementation

Konrad states that with Akka Streams, Lightbend chose to lay the foundations for future monitoring and debugging utilities from the start, since adding such capabilities afterwards takes considerable effort and “pain”.
“This decision came from our experience with implementing Actor based architectures, where we learned that their performance (multiple millions of messages per second) is usually vastly superior to their non-reactive counterparts.”
“However, the actual difficulty of asynchronous distributed systems lies with understanding and operationalizing these systems. So while we’re still focused on the performance aspect, we’re about to work on superior ‘understandability’ tooling, that with other libraries would simply not be possible to “slap on” as an afterthought.”
Konrad has been involved in the standard as well as in the implementation of Akka Streams, which provides various operations (like filter, map, mapConcat, balance, merge and route) as well as a collection of connectors (codenamed Alpakka) to external systems such as Kafka, Cassandra, SQL databases, JMS message queues and more.
He adds: “In a way you can look at it from the perspective of attempting to fill the gap that Camel has left open in the integration space – reactive, high-performance, asynchronous integration between systems. We also use them to implement our HTTP server and remoting infrastructure.”
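What makes operations like map composable under back pressure is that a stage between producer and consumer forwards demand upstream while transforming elements downstream. Here is a hedged, plain-Java sketch of that idea – the simplified interfaces and the `map` helper are illustrations of the general technique, not Akka Streams' actual internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class MapStageSketch {

    // Simplified stand-ins for the Reactive Streams contract.
    interface Subscription { void request(long n); }
    interface Subscriber<T> { void onSubscribe(Subscription s); void onNext(T t); void onComplete(); }
    interface Publisher<T> { void subscribe(Subscriber<? super T> s); }

    // A "map" stage: demand passes straight through to the upstream producer,
    // elements flow downstream transformed - back pressure stays intact.
    static <A, B> Publisher<B> map(Publisher<A> upstream, Function<A, B> f) {
        return downstream -> upstream.subscribe(new Subscriber<A>() {
            @Override public void onSubscribe(Subscription s) { downstream.onSubscribe(s); }
            @Override public void onNext(A a) { downstream.onNext(f.apply(a)); }
            @Override public void onComplete() { downstream.onComplete(); }
        });
    }

    // Demand-driven source emitting [0, count), as in the earlier sketch.
    static Publisher<Integer> range(int count) {
        return s -> s.onSubscribe(new Subscription() {
            int next = 0;
            boolean done = false;
            @Override public void request(long n) {
                for (long i = 0; i < n && next < count; i++) s.onNext(next++);
                if (next == count && !done) { done = true; s.onComplete(); }
            }
        });
    }

    public static void main(String[] args) {
        List<Integer> out = new ArrayList<>();
        map(range(4), x -> x * 10).subscribe(new Subscriber<Integer>() {
            @Override public void onSubscribe(Subscription s) { s.request(Long.MAX_VALUE); }
            @Override public void onNext(Integer i) { out.add(i); }
            @Override public void onComplete() { System.out.println("out=" + out); }
        });
    }
}
```

Because the stage only relays demand rather than buffering, chaining many such operators still leaves the slowest consumer in control of the whole pipeline.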

The Future of Streams

Lightbend is now growing rapidly as awareness of message-driven architectures increases. The company is currently implementing HTTP/2 inside a fully reactive HTTP server (Akka HTTP), and improving the performance of the remoting subsystem, which can push 700k messages per second in a typical EC2 environment.
According to Konrad, “This combination of leading HTTP support across our tools and insanely fast communication inside the cluster, allows us to build applications in ways that were not possible before.”
He also adds that “With high throughput and low latency of within-cluster messaging, we’re able to spend a few hops between servers, where otherwise you would not be able to afford that simply because of the network call overheads. This allows our systems to scale way more efficiently than in other all-the-way-JSON-and-HTTP architectures.”
To learn more about the principles and aspects of the Reactive initiative, download the “Why Reactive?” eBook by Konrad himself.

Final Thoughts

We at OverOps are always intrigued by new technologies and initiatives that aim to make developers’ lives easier. Reactive Streams was built as a community initiative to solve the issue of back pressure, and now it’s about to become part of JDK 9. That’s pretty impressive, and we’re all for it.

Henn is a marketing manager at OverOps covering topics related to Java, Scala and everything in between. She is a lover of gadgets, apps, technology and tea.
