Java Adaptive Rate Limiting

Adaptive rate limiting is a form of rate limiting that dynamically adjusts the permitted request rate based on the current load on the server. It is useful when server load fluctuates and sudden spikes in traffic could push it beyond its capacity.

One way to implement adaptive rate limiting is to use the token bucket algorithm, where the bucket size is dynamically adjusted based on the current server load.

Here’s an example implementation of adaptive rate limiting using the token bucket algorithm in Java:

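public class AdaptiveRateLimiter {
    private long lastRefillTime;      // time of the last refill, in milliseconds
    private double refillRate;        // tokens added per second
    private double maxCapacity;       // maximum number of tokens the bucket can hold
    private double currentCapacity;   // tokens currently available

    public AdaptiveRateLimiter(double refillRate, double maxCapacity) {
        this.refillRate = refillRate;
        this.maxCapacity = maxCapacity;
        this.currentCapacity = maxCapacity;   // start with a full bucket
        this.lastRefillTime = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        refill();   // credit tokens accrued since the last call
        // Require a whole token so the balance never goes negative.
        if (currentCapacity >= 1.0) {
            currentCapacity -= 1.0;   // consume one token for this request
            return true;
        } else {
            return false;
        }
    }

    private void refill() {
        long currentTime = System.currentTimeMillis();
        double elapsedTime = (currentTime - lastRefillTime) / 1000.0;   // seconds
        double capacityToAdd = elapsedTime * refillRate;
        currentCapacity = Math.min(currentCapacity + capacityToAdd, maxCapacity);
        lastRefillTime = currentTime;
    }
}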

In this implementation, the AdaptiveRateLimiter class has a refill rate (tokens per second) and a maximum capacity, both set in the constructor. Callers invoke tryAcquire() to check whether a new request may proceed. The method first calls refill() to credit the bucket with the tokens accrued since the last refill. If at least one token is available, it consumes one token and returns true to indicate that the request can be made. Otherwise, it returns false.
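
To make the tryAcquire() flow concrete, the sketch below replays it with fixed timestamps. It is an illustration rather than the class above: TokenBucketWalkthrough and its explicit nowMillis parameter are assumptions introduced here so that the result does not depend on the wall clock.

```java
// Illustrative only: the class name and the explicit nowMillis parameter are
// assumptions made so the walkthrough is reproducible.
class TokenBucketWalkthrough {
    private final double refillRate;    // tokens per second
    private final double maxCapacity;   // bucket size
    private double currentCapacity;     // tokens currently available
    private long lastRefillTime;        // milliseconds

    TokenBucketWalkthrough(double refillRate, double maxCapacity, long nowMillis) {
        this.refillRate = refillRate;
        this.maxCapacity = maxCapacity;
        this.currentCapacity = maxCapacity;   // start with a full bucket
        this.lastRefillTime = nowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        refill(nowMillis);
        if (currentCapacity >= 1.0) {
            currentCapacity -= 1.0;   // one token per request
            return true;
        }
        return false;
    }

    private void refill(long nowMillis) {
        double elapsedSeconds = (nowMillis - lastRefillTime) / 1000.0;
        currentCapacity = Math.min(currentCapacity + elapsedSeconds * refillRate, maxCapacity);
        lastRefillTime = nowMillis;
    }

    public static void main(String[] args) {
        // 2 tokens per second, bucket of 3, starting full at t = 0 ms.
        TokenBucketWalkthrough bucket = new TokenBucketWalkthrough(2.0, 3.0, 0L);
        long[] times = {0, 0, 0, 0, 500, 500};
        for (long t : times) {
            System.out.println("t=" + t + "ms -> " + bucket.tryAcquire(t));
        }
    }
}
```

With these numbers, the first three calls at t=0 drain the initial burst, the fourth is rejected, and the half second that follows earns exactly one more token at 2 tokens per second, so one call at t=500 succeeds and the next is rejected.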

The refill() method computes how many tokens to add by multiplying the time elapsed since the last refill by the refill rate. It adds them to the current capacity, capped at the maximum capacity, and then updates the last refill time. As written, the refill rate and maximum capacity are fixed at construction, so the class on its own behaves as a plain token bucket; to make it adaptive, those two values must be adjusted at runtime from a measure of server load. With that adjustment, the limiter can throttle requests under heavy load while still allowing burst traffic during periods of low load.
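
One way the load-based adjustment could look is sketched below. Everything specific in it is an assumption made for illustration, not part of the class above: the LoadAwareRateLimiter name, the adjustForLoad() method, a load value normalized to the range 0.0 (idle) to 1.0 (saturated), the minFraction floor, the linear scaling rule, and the injected nanosecond clock.

```java
import java.util.function.LongSupplier;

// A sketch, not a definitive implementation: the bucket size and refill rate
// shrink linearly as the reported load rises, down to a configurable floor.
class LoadAwareRateLimiter {
    private final double baseRefillRate;   // tokens per second at zero load
    private final double baseCapacity;     // bucket size at zero load
    private final double minFraction;      // share of the base limits kept at full load
    private final LongSupplier nanoClock;  // hypothetical injected clock, e.g. System::nanoTime

    private double refillRate;
    private double maxCapacity;
    private double currentCapacity;
    private long lastRefillNanos;

    LoadAwareRateLimiter(double baseRefillRate, double baseCapacity,
                         double minFraction, LongSupplier nanoClock) {
        this.baseRefillRate = baseRefillRate;
        this.baseCapacity = baseCapacity;
        this.minFraction = minFraction;
        this.nanoClock = nanoClock;
        this.refillRate = baseRefillRate;
        this.maxCapacity = baseCapacity;
        this.currentCapacity = baseCapacity;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // load is assumed to be normalized: 0.0 = idle, 1.0 = saturated.
    synchronized void adjustForLoad(double load) {
        refill();   // credit tokens earned at the old rate before changing it
        double clamped = Math.max(0.0, Math.min(1.0, load));
        double fraction = minFraction + (1.0 - minFraction) * (1.0 - clamped);
        refillRate = baseRefillRate * fraction;
        maxCapacity = baseCapacity * fraction;
        currentCapacity = Math.min(currentCapacity, maxCapacity);   // drop tokens above the new cap
    }

    synchronized boolean tryAcquire() {
        refill();
        if (currentCapacity >= 1.0) {
            currentCapacity -= 1.0;
            return true;
        }
        return false;
    }

    synchronized double getMaxCapacity() { return maxCapacity; }

    synchronized double getRefillRate() { return refillRate; }

    private void refill() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        currentCapacity = Math.min(currentCapacity + elapsedSeconds * refillRate, maxCapacity);
        lastRefillNanos = now;
    }
}
```

In this sketch a caller would construct the limiter with System::nanoTime as the clock and call adjustForLoad() periodically with whatever load measure it trusts; how that measure is obtained is left open here. Linear scaling is only one possible rule, and a stepped or smoothed rule may suit workloads where the load signal is noisy.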