Redis Rate Limiting: INCR, Sliding Window, Leaky Bucket & Token Bucket

Rate limiting is one of the most common Redis use cases. Whether you're protecting an API from abuse or throttling background jobs, Redis provides the primitives to build it. This guide walks through five algorithms with real commands.

The Problem

Without rate limiting, a single user or bot can:

  • Exhaust your API server resources.
  • Trigger expensive database queries thousands of times per second.
  • Rack up cloud bills from runaway processes.

Algorithm 1: Fixed Window Counter (INCR + EXPIRE)

The simplest approach. Count requests in a fixed time window.

How It Works

  1. Use a key like ratelimit:<user_id>:<window>.
  2. INCR the counter on each request.
  3. If the counter exceeds the limit, reject the request.
  4. When INCR returns 1 (the first request in the window), set EXPIRE to define the window.

Commands

SET ratelimit:user:1001:window "0" EX 60
INCR ratelimit:user:1001:window
INCR ratelimit:user:1001:window
INCR ratelimit:user:1001:window
GET ratelimit:user:1001:window
TTL ratelimit:user:1001:window

Atomic Version with MULTI

To avoid a race condition between INCR and EXPIRE (e.g. a crash after INCR leaving a counter with no TTL), wrap both in a transaction:

MULTI
INCR ratelimit:user:1001:min
EXPIRE ratelimit:user:1001:min 60
EXEC

Note that this re-runs EXPIRE on every request, resetting the TTL, so a steady stream of traffic keeps the window alive and the counter never resets. For a strict fixed window, use a Lua script that calls EXPIRE only when INCR returns 1, or EXPIRE with the NX option (Redis 7.0+), which sets the TTL only if the key has none.
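The window bookkeeping can also be sketched in application code. Below is a minimal in-memory simulation of the fixed-window logic; the `FixedWindowLimiter` name and its parameters are illustrative, and a real implementation would issue the INCR/EXPIRE commands above through a Redis client:

```python
import time

class FixedWindowLimiter:
    """In-memory simulation of the INCR + EXPIRE fixed-window counter."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # window_start -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        window_start = int(now // self.window) * self.window
        # Old windows would be evicted by the Redis TTL; drop them here.
        self.counters = {w: c for w, c in self.counters.items()
                         if w == window_start}
        count = self.counters.get(window_start, 0) + 1  # INCR
        self.counters[window_start] = count
        return count <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow(now=10) for _ in range(4)])  # [True, True, True, False]
print(limiter.allow(now=70))                      # new window: True
```

Because the counter resets at each window boundary, this is the easiest variant to reason about and debug.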

Pitfall: Boundary Burst

A user can send 100 requests at second 59 of window 1, and 100 more at second 0 of window 2 — effectively 200 requests in 2 seconds. The sliding window fixes this.

Algorithm 2: Sliding Window Log

Track each request timestamp in a sorted set. This gives precise rate limiting over any window, at the cost of memory.

How It Works

  1. Add the current timestamp as the score, with a unique member (e.g. the timestamp plus a request ID, so two requests in the same second don't overwrite each other), in a ZSET.
  2. Remove entries older than the window.
  3. Count remaining entries.
  4. If count exceeds limit, reject.

Commands

ZADD ratelimit:user:1001:sliding 1739600001 "req:1739600001"
ZADD ratelimit:user:1001:sliding 1739600002 "req:1739600002"
ZADD ratelimit:user:1001:sliding 1739600030 "req:1739600030"

ZREMRANGEBYSCORE ratelimit:user:1001:sliding 0 1739599970

ZCARD ratelimit:user:1001:sliding

Pitfall

Memory usage grows with request volume. Each request stores a member in the ZSET. For high-traffic endpoints, this can consume significant memory. Set a TTL on the ZSET key as a safety net:

EXPIRE ratelimit:user:1001:sliding 120
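The same bookkeeping can be simulated in application code. In this sketch a plain list stands in for the ZSET; in production each step would be the ZADD, ZREMRANGEBYSCORE, and ZCARD calls above, and the `SlidingWindowLog` name is illustrative:

```python
class SlidingWindowLog:
    """In-memory simulation of the sorted-set sliding window log."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = []  # stands in for the ZSET scores

    def allow(self, now):
        # ZREMRANGEBYSCORE: drop entries older than the window
        cutoff = now - self.window
        self.timestamps = [t for t in self.timestamps if t > cutoff]
        if len(self.timestamps) >= self.limit:  # ZCARD check
            return False
        self.timestamps.append(now)  # ZADD
        return True

log = SlidingWindowLog(limit=2, window_seconds=60)
print(log.allow(0), log.allow(1), log.allow(59))  # True True False
print(log.allow(62))  # both earlier entries aged out: True
```

Unlike the fixed window, the request at second 59 is rejected even though it falls in a "new" minute, because the log always looks back exactly 60 seconds.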

Algorithm 3: Sliding Window Counter (Hybrid)

A memory-efficient compromise between fixed window and sliding window log. Uses two fixed windows and interpolates.

How It Works

Assume a 60-second window with a limit of 100 requests:

  • Current window count: 40 (started 20 seconds ago)
  • Previous window count: 80

Weighted count = previous × ((window - elapsed) / window) + current = 80 × (40/60) + 40 = 93.3

Since 93.3 < 100, the request is allowed.

Commands

SET ratelimit:user:1001:prev "80" EX 120
SET ratelimit:user:1001:curr "0" EX 60
INCR ratelimit:user:1001:curr
INCR ratelimit:user:1001:curr
GET ratelimit:user:1001:prev
GET ratelimit:user:1001:curr

The interpolation happens in application code. Redis just stores the counters.
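That application-side interpolation can be sketched as a small function; the name and parameters are illustrative, and the two counts would come from the GET calls above:

```python
def sliding_window_allow(prev_count, curr_count, elapsed, window, limit):
    """Weighted count = prev * ((window - elapsed) / window) + curr."""
    weighted = prev_count * ((window - elapsed) / window) + curr_count
    return weighted < limit, weighted

# The worked example from above: prev=80, curr=40, 20s into a 60s window.
allowed, weighted = sliding_window_allow(prev_count=80, curr_count=40,
                                         elapsed=20, window=60, limit=100)
print(round(weighted, 1), allowed)  # 93.3 True
```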

Algorithm 4: Leaky Bucket

Requests enter a "bucket" and are processed at a fixed rate. If the bucket is full, new requests are rejected.

How It Works

Model the bucket as a counter that decrements over time:

  • Each request increments the counter.
  • A background process (or lazy evaluation) decrements it at a fixed rate.
  • If counter > bucket size, reject.

Commands (Simplified with Lua)

In practice, leaky bucket is implemented with a Lua script for atomicity. Here's the conceptual model:

SET bucket:user:1001:level "0"
SET bucket:user:1001:last_check "1739600000"

INCR bucket:user:1001:level
GET bucket:user:1001:level

The leak calculation happens in application code or a Lua script:

leaked = (now - last_check) * leak_rate
level = max(0, current_level - leaked)
if level + 1 > capacity: reject
else: level = level + 1   # admit the request
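Putting the pieces together, here is a minimal in-memory sketch of the lazy-leak approach; the `LeakyBucket` name is illustrative, and in production this logic would live in a Lua script operating on the Redis keys above:

```python
class LeakyBucket:
    """Lazy-leak bucket: the level drains at leak_rate per second."""

    def __init__(self, capacity, leak_rate):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0
        self.last_check = 0.0

    def allow(self, now):
        # Leak whatever drained since the last check, then try to add 1.
        leaked = (now - self.last_check) * self.leak_rate
        self.level = max(0.0, self.level - leaked)
        self.last_check = now
        if self.level + 1 > self.capacity:
            return False  # bucket full, reject
        self.level += 1
        return True

bucket = LeakyBucket(capacity=2, leak_rate=1.0)  # drains 1 request/second
print(bucket.allow(0), bucket.allow(0), bucket.allow(0))  # True True False
print(bucket.allow(1))  # one request has leaked out: True
```

The lazy evaluation means no background process is needed: the leak is computed on demand from the elapsed time.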

Use Case

Best for smoothing out bursty traffic into a steady stream. Common in API gateways.

Algorithm 5: Token Bucket

Tokens are added to a bucket at a fixed rate. Each request consumes one token. If no tokens are available, the request is rejected.

How It Works

  • Bucket starts full (e.g., 10 tokens).
  • Each request removes a token.
  • Tokens are replenished at a fixed rate (e.g., 1 per second).
  • Allows short bursts up to the bucket capacity.

Commands

SET bucket:user:1001:tokens "10"
SET bucket:user:1001:last_refill "1739600000"

DECR bucket:user:1001:tokens
GET bucket:user:1001:tokens

Refill logic (in application or Lua):

elapsed = now - last_refill
new_tokens = min(capacity, tokens + elapsed * refill_rate)
last_refill = now
if new_tokens < 1: reject
else: new_tokens = new_tokens - 1
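The refill logic can be sketched as a small in-memory class; the `TokenBucket` name is illustrative, and in production the token count and last-refill timestamp would be the Redis keys above, updated atomically in a Lua script:

```python
class TokenBucket:
    """Lazy-refill token bucket: refill_rate tokens/second, burst up to capacity."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # bucket starts full
        self.last_refill = 0.0

    def allow(self, now):
        # Add the tokens accrued since the last request, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens < 1:
            return False
        self.tokens -= 1
        return True

tb = TokenBucket(capacity=3, refill_rate=1.0)
print([tb.allow(0) for _ in range(4)])  # burst: [True, True, True, False]
print(tb.allow(1))                      # one token refilled: True
```

Contrast with the leaky bucket: here the full burst of 3 is served immediately, then the average rate of 1 per second takes over.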

Use Case

Best when you want to allow bursts but enforce an average rate. Used by AWS, Stripe, and most cloud APIs.

Comparison Table

Algorithm          Precision  Memory    Burst Handling            Complexity
Fixed Window       Low        Very Low  Allows boundary burst     Simple
Sliding Log        High       High      No burst                  Medium
Sliding Counter    Medium     Low       Minimal burst             Medium
Leaky Bucket       High       Low       Smooths bursts            High
Token Bucket       High       Low       Allows controlled burst   High

Pitfalls

  1. Race conditions: Always use MULTI/EXEC or Lua scripts for atomic operations.
  2. Clock skew: In distributed systems, use Redis server time (TIME command) instead of client time.
  3. Memory leaks: Always set TTL on rate limit keys. Forgotten keys from inactive users accumulate.
  4. Over-engineering: Start with fixed window (INCR + EXPIRE). Only upgrade to sliding window or token bucket when you actually hit boundary burst issues.

Try It in the Editor

Head to the Redis Online Editor and experiment:

SET ratelimit:user:1001:window "0" EX 60
INCR ratelimit:user:1001:window
INCR ratelimit:user:1001:window
INCR ratelimit:user:1001:window
GET ratelimit:user:1001:window
TTL ratelimit:user:1001:window

ZADD ratelimit:sliding 100 "req:1" 200 "req:2" 300 "req:3"
ZCARD ratelimit:sliding
ZRANGEBYSCORE ratelimit:sliding 150 +inf

Try incrementing the counter past your limit and see how the logic works. Then try the ZSET approach for sliding window precision.