Implementing backpressure in Spring Boot Reactive Programming
In today's web applications, managing high data flow efficiently is crucial for performance and scalability. In this article, we explore how to control data flow in reactive systems. Backpressure helps a system handle different data production and consumption rates, preventing overload and maintaining stability. Using Spring Boot, a framework for Java-based microservices, with Project Reactor, developers can create applications that remain responsive under heavy load. This guide explains the importance of backpressure and how to implement it in Spring Boot to build stable and high-performance reactive applications.
1. What is Backpressure in Reactive Programming?
In reactive streams, backpressure is a crucial concept. It allows a data consumer (subscriber) to communicate with the data producer (publisher) about the amount of data it can handle at a time. This prevents overwhelming scenarios and ensures a smooth and manageable data flow.
Think of drinking from a water fountain whose water comes out too fast. Backpressure is the ability to slow that flow so you can drink comfortably without being overwhelmed.
2. Why is backpressure necessary?
Think of it as the consumer's "stop" button. Without backpressure, systems can run into issues such as:
- Memory overruns: When data arrives faster than the application can process it, unprocessed items pile up and the server may run out of memory.
- Poor performance: As the system struggles to manage the irregular flow, overall performance degrades, causing slower response times and a poor user experience.
- System crashes: In extreme cases, such as an unexpected spike in demand, the system may crash outright.
3. Backpressure in Project Reactor
In Project Reactor, a Flux or Mono does not simply push all of its data at once. Instead, the part that receives the data (the subscriber) asks for only as much data as it can deal with at one time. This asking is done through a method called request(n).
Here's how it works:
- request(n): The subscriber can request a specific number of items. In this way, the subscriber informs the producer about its current capacity.
- Dynamic adjustment: As the subscriber processes data, it can dynamically adjust its requests based on current load, processing speed, and other factors.
- Propagated backpressure: Backpressure is not just between one producer and one consumer, but is used in entire systems to keep a balance. When a downstream Subscriber experiences backpressure (by requesting fewer items or none at all), this backpressure is propagated up the chain to any upstream operators (like transforms, filters, etc.) and finally to the Publisher itself. This ensures that the entire chain adjusts its behavior in response to the available capacity of the final Subscriber.
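The request(n) handshake is part of the Reactive Streams contract, which the JDK also exposes through java.util.concurrent.Flow. As a minimal, framework-free sketch of the mechanism (the class name and batch size are arbitrary choices for illustration), a subscriber can request items in fixed-size batches so the publisher never emits more than was asked for:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BatchingSubscriber {

    // Consumes `total` integers from a publisher, requesting them
    // in batches of `batchSize` via Subscription.request(n).
    static List<Integer> consume(int total, int batchSize) throws InterruptedException {
        List<Integer> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;
                private int inBatch = 0;

                @Override public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(batchSize);                 // initial demand
                }
                @Override public void onNext(Integer item) {
                    received.add(item);
                    if (++inBatch == batchSize) {         // batch fully consumed,
                        inBatch = 0;
                        subscription.request(batchSize);  // ask for the next batch
                    }
                }
                @Override public void onError(Throwable t) { done.countDown(); }
                @Override public void onComplete() { done.countDown(); }
            });
            for (int i = 1; i <= total; i++) publisher.submit(i);
        } // close() delivers onComplete once buffered items drain
        done.await();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(consume(6, 2)); // [1, 2, 3, 4, 5, 6]
    }
}
```

The publisher here never outruns the subscriber, because items beyond the outstanding demand stay in the publisher's buffer until the next request(n) call.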
4. Detecting overwhelmed consumers
In an ideal scenario, data flows seamlessly from senders to receivers. However, there are instances when receivers become overwhelmed by the volume of incoming data. It's crucial to identify these situations and implement appropriate solutions. Here are some indicators to watch for while monitoring application performance:
- Observing latency and throughput: An increase in processing time or a decrease in the amount of data processed over time can indicate potential issues.
- Error rates and patterns: A high frequency of errors, or specific types of errors, can signal that the system is being overwhelmed by too much data.
- Resource utilization: Excessive usage of CPU or memory resources by the consumer may suggest that it is receiving more data than it can handle.
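One lightweight way to watch per-item latency inside a Reactor pipeline itself is the elapsed() operator, which pairs each element with the milliseconds since the previous emission. A minimal sketch, assuming reactor-core is on the classpath (the delay and the 100 ms threshold are arbitrary assumptions for illustration):

```java
import java.time.Duration;
import reactor.core.publisher.Flux;

public class LatencyProbe {
    public static void main(String[] args) {
        Flux.range(1, 5)
            .delayElements(Duration.ofMillis(50))  // simulate a slow producer
            .elapsed()                             // pair each item with ms since the previous one
            .doOnNext(t -> {
                if (t.getT1() > 100) {             // arbitrary threshold: flag slow items
                    System.out.println("Slow item " + t.getT2() + " took " + t.getT1() + " ms");
                }
            })
            .map(t -> t.getT2())                   // drop the timing, keep the value
            .blockLast();                          // wait for the stream to finish (demo only)
        System.out.println("done");
    }
}
```

In a real service you would feed these timings into your metrics system rather than print them, but the same signal (rising per-item elapsed time) is an early indicator of an overwhelmed consumer.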
5. Strategies for handling backpressure
There are several strategies to manage backpressure in a reactive system:
- Buffering: Temporarily holding data until the consumer is ready. This strategy works well if the overload is short-lived or intermittent.
- Dropping data: In some cases, particularly with real-time data, it might be acceptable to drop some data to keep up with the flow.
- Batching: Accumulating data into larger, less frequent batches can reduce overhead and allow consumers to catch up.
- Rate limiting: Limiting the rate at which the producer sends data to match the consumer’s capacity.
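Project Reactor ships operators that map directly onto these strategies. A minimal sketch, assuming reactor-core is on the classpath (the buffer sizes and limits are arbitrary values for illustration):

```java
import java.util.List;
import reactor.core.publisher.Flux;

public class BackpressureStrategies {
    public static void main(String[] args) {
        // Buffering: hold items (here up to 256) until the subscriber catches up
        Flux<Integer> buffered = Flux.range(1, 1000).onBackpressureBuffer(256);

        // Dropping: discard items for which there is no downstream demand
        Flux<Integer> dropping = Flux.range(1, 1000)
                .onBackpressureDrop(d -> System.out.println("Dropped: " + d));

        // Batching: emit lists of 100 items instead of individual items
        Flux<List<Integer>> batched = Flux.range(1, 1000).buffer(100);

        // Rate limiting: request at most 50 items from upstream at a time
        Flux<Integer> limited = Flux.range(1, 1000).limitRate(50);

        System.out.println("Batches: " + batched.count().block()); // Batches: 10
        System.out.println("Items: " + limited.count().block());   // Items: 1000
    }
}
```

Note that limitRate caps upstream demand rather than wall-clock rate; it keeps the producer from flooding the pipeline while still delivering every item.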
6. Implementing backpressure in the project
In this section, we will look at two variations of an API used to fetch user data, each with a different approach to handling large streams of data:
- Normal API (without backpressure): This API returns user data without any backpressure control. It simply streams the data to the client as requested.
  URL: /users
- Reactive API (with backpressure): This API implements backpressure, meaning it adjusts the flow of data based on the client's ability to process it, preventing the client from being overwhelmed.
  URL: /users/stream
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("/users")
public class UserController {

    private static final Logger log = LoggerFactory.getLogger(UserController.class);

    private final UserRepository userRepository;

    public UserController(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    @GetMapping
    public Flux<User> getAllUsers() {
        long start = System.currentTimeMillis();
        return userRepository.findAll()
                .doOnSubscribe(subscription -> log.debug("Subscribed to User stream!"))
                .doOnNext(user -> log.debug("Processed User: {} in {} ms", user.name(), System.currentTimeMillis() - start))
                .doOnComplete(() -> log.info("Finished streaming users for getAllUsers in {} ms", System.currentTimeMillis() - start));
    }

    @GetMapping("/stream")
    public Flux<User> streamUsers() {
        long start = System.currentTimeMillis();
        return userRepository.findAll()
                .onBackpressureBuffer() // Buffer strategy for backpressure
                .doOnNext(user -> log.debug("Processed User: {} in {} ms", user.name(), System.currentTimeMillis() - start))
                .doOnError(error -> log.error("Error streaming users", error))
                .doOnComplete(() -> log.info("Finished streaming users for streamUsers in {} ms", System.currentTimeMillis() - start));
    }
}
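The controller assumes a User type with a name() accessor and a UserRepository whose findAll() returns a Flux of users; neither is shown in the article, so the following is an illustrative sketch only (in a real Spring Data R2DBC project, the repository would typically extend ReactiveCrudRepository<User, Long> and inherit findAll()):

```java
import reactor.core.publisher.Flux;

// Assumed domain model: the controller calls user.name(), which suggests a record.
record User(Long id, String name) {}

// Assumed reactive repository contract returning a stream of users.
interface UserRepository {
    Flux<User> findAll();
}
```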
Use the following bash script to simulate a load test. The script sends multiple concurrent requests to the API and prints each response time, allowing you to compare the behavior of the API with and without backpressure.
You can modify the script to test both endpoints, /users (normal API) and /users/stream (API with backpressure), and then compare metrics such as response times and system load.
#!/bin/bash
# A simple script to create load by sending multiple concurrent requests to the server.

# Number of requests to send
REQUESTS=300

# The endpoint to test
URL="http://localhost:8080/users/stream"

for i in $(seq 1 "$REQUESTS")
do
  # -s silences progress output, -o discards the body, and -w prints the
  # total response time. The trailing ampersand runs each request in the
  # background, allowing for concurrency.
  curl -s -o /dev/null -w "%{time_total}\n" "$URL" &
done

wait # Wait for all background jobs to finish
echo "All requests sent."
7. Analysis
In this section, we will analyze the results from the load tests performed and see what they might indicate:
[Log output from the getAllUsers endpoint]
The getAllUsers endpoint quickly retrieves all user data without managing the flow of incoming requests. Initially it responds fast, but as more clients access it, each request takes progressively longer, sometimes up to twice as long. This happens because the system becomes overwhelmed by traffic and lacks proper request management: it keeps accepting requests, and the growing load causes each task to take longer as the system becomes too busy to process them all efficiently.
[Log output from the streamUsers endpoint]
The streamUsers endpoint manages data flow with backpressure via .onBackpressureBuffer(). Initially it handles requests at a steady pace. As traffic increases, the response time for each request rises gradually, more slowly than for the getAllUsers endpoint, but still increasing. This slower rise in response time indicates that the system is actively managing the data flow, adjusting to the load to avoid becoming overwhelmed.
8. Conclusion
Based on the analysis, we can draw conclusions:
- Initial performance: Both endpoints start with similar processing times for streaming users, showing that they behave almost identically under normal conditions.
- Performance under load: As the load increases, the time to finish streaming all users grows in both cases. However, the increase is steeper for the getAllUsers endpoint, which shows that it becomes overwhelmed more quickly. The streamUsers endpoint shows a more gradual increase, suggesting that backpressure is allowing the system to handle the load more gracefully.