hard · Senior Backend Engineer · Cloud
Design a rate limiter for a distributed system — what algorithms exist and which would you choose for an API gateway?
Posted 18/04/2026
by Mehedy Hasan Ador
Question Details
At a cloud provider interview:
> "We need to implement rate limiting at our API gateway. Millions of users, distributed across 10 regions. Requirements: 100 req/s per user, with burst support up to 150. Which algorithm would you choose and why?"
Suggested Solution
Rate Limiting Algorithms
1. Token Bucket (Recommended for burst support)
Bucket holds tokens (max: 150)
Refills at: 100 tokens/sec
Each request consumes 1 token
Bucket empty → reject
Timeline:
[150 tokens] → burst of 150 requests → [0 tokens] → reject
refill at 100/sec
[100 tokens after 1s] → accept 100 more
// Redis implementation
async function tokenBucket(userId: string): Promise<boolean> {
  const key = `bucket:${userId}`;
  const [tokens, lastRefill] = await redis.hmget(key, "tokens", "lastRefill");
  const now = Date.now();
  // First request for this user: start with a full bucket
  const current = tokens === null ? MAX_TOKENS : parseFloat(tokens);
  const last = lastRefill === null ? now : parseInt(lastRefill, 10);
  const elapsed = (now - last) / 1000;
  const newTokens = Math.min(
    MAX_TOKENS, // 150
    current + elapsed * REFILL_RATE // 100/sec
  );
  if (newTokens < 1) return false; // Rate limited
  await redis.hmset(key, {
    tokens: newTokens - 1,
    lastRefill: now,
  });
  // Expire idle buckets once they would be full again anyway
  await redis.expire(key, Math.ceil(MAX_TOKENS / REFILL_RATE));
  return true; // Allowed
}
// Note: this read-modify-write races under concurrent requests for the same
// user; in production, run it as a single Lua script (EVAL) so Redis applies
// it atomically.
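The Redis version above is hard to unit-test without a live server. Below is a minimal in-memory sketch of the same refill math; the class name and the injectable clock are illustrative choices for testability, not part of the original code.

```typescript
// In-memory token bucket mirroring the Redis logic above: capacity 150,
// refill 100 tokens/sec. The injectable clock makes the refill math testable.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,   // max tokens (burst size, e.g. 150)
    private readonly refillRate: number, // tokens added per second (e.g. 100)
    private readonly now: () => number = Date.now // ms clock, injectable for tests
  ) {
    this.tokens = capacity; // start full: the first burst is allowed
    this.lastRefill = this.now();
  }

  tryConsume(): boolean {
    const t = this.now();
    const elapsed = (t - this.lastRefill) / 1000;
    // Refill proportionally to elapsed time, capped at capacity
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = t;
    if (this.tokens < 1) return false; // rate limited
    this.tokens -= 1;
    return true;
  }
}
```

A full bucket admits a burst of 150 requests, then exactly 100 more per second, matching the interview requirements.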
2. Sliding Window Counter
async function slidingWindow(userId: string): Promise<boolean> {
  const now = Date.now();
  // One counter per 1-second window. Strictly speaking this is a fixed-window
  // counter; a true sliding window also weights the previous window's count.
  const windowKey = `rate:${userId}:${Math.floor(now / 1000)}`;
  const count = await redis.incr(windowKey);
  if (count === 1) await redis.expire(windowKey, 2); // 2s buffer
  return count <= LIMIT; // 100
}
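A true sliding window counter blends the previous window's count, weighted by how much of it still overlaps the trailing one-second window. A minimal in-memory sketch of that weighting (class name and injectable clock are illustrative):

```typescript
// Sliding window counter: estimate the request rate over the trailing
// 1-second window by blending the previous window's count with the current one.
class SlidingWindowCounter {
  private windows = new Map<number, number>(); // window index (sec) -> count

  constructor(
    private readonly limit: number,               // e.g. 100 req per second
    private readonly now: () => number = Date.now // ms clock, injectable for tests
  ) {}

  tryAcquire(): boolean {
    const t = this.now();
    const cur = Math.floor(t / 1000);
    const prevCount = this.windows.get(cur - 1) ?? 0;
    const curCount = this.windows.get(cur) ?? 0;
    // Fraction of the previous window still inside the trailing 1s window
    const overlap = 1 - (t % 1000) / 1000;
    const estimated = prevCount * overlap + curCount;
    if (estimated >= this.limit) return false; // rate limited
    this.windows.set(cur, curCount + 1);
    this.windows.delete(cur - 2); // windows older than 2s can never matter again
    return true;
  }
}
```

The same estimate can be computed in Redis from two adjacent per-second counters, which is why a 2-second TTL on each counter is sufficient.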
Algorithm Comparison
Token Bucket: allows bursts up to capacity (150 here), two fields of state per user, but the refill math needs an atomic update
Sliding Window Counter: smoother enforcement with no burst allowance, one counter per window, just INCR + EXPIRE
Fixed Window Counter: simplest of all, but can admit up to 2x the limit around window boundaries
Leaky Bucket: drains at a constant rate, smooths traffic, but offers no burst support
Distributed Rate Limiting Challenge
Region A: user has a full, independent bucket (refills at 100/s)
Region B: user has a full, independent bucket (refills at 100/s)
Each region enforces the limit separately → the user gets 200 req/s across two regions (2x the limit!)
Solutions
Option A: Redis Cluster (shared state)
// All regions point to the same Redis cluster
const redis = new Redis.Cluster([
  { host: "redis-use1.example.com" },
  { host: "redis-euw1.example.com" },
]);
// Adds latency per check: a few ms in-region, tens of ms for cross-region hops
Option B: Regional limits + sticky sessions
// Divide the limit across regions: 100/10 = 10 req/s per region
// A user typically hits 1-2 regions → gets 10-20 req/s
// Simple but imprecise
Option C: Async reconciliation (best for scale)
// Each region allows the request locally and logs it to Kafka
// A background job checks the global rate, revokes tokens if exceeded
// Eventual consistency — slight over-limit is possible
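Option C's trade-off can be sketched in memory: regions admit optimistically against a local quota, and a reconciler redistributes quota after the fact. All names below are hypothetical; a real deployment would log to Kafka and run the reconciler as a background job, as described above.

```typescript
// Sketch of async reconciliation: each region admits against a local quota;
// a periodic reconciler compares global usage to the global limit and
// reapportions quota by observed traffic share. Over-admission is possible
// between reconciliations: that is the eventual-consistency trade-off.
class ReconcilingLimiter {
  private counts = new Map<string, number>(); // region -> requests this interval
  private quotas = new Map<string, number>(); // region -> current allowance

  constructor(private readonly globalLimit: number, regions: string[]) {
    // Start optimistic: any single region may serve the full global limit
    for (const r of regions) this.quotas.set(r, globalLimit);
  }

  tryAcquire(region: string): boolean {
    const used = this.counts.get(region) ?? 0;
    if (used >= (this.quotas.get(region) ?? 0)) return false;
    this.counts.set(region, used + 1);
    return true;
  }

  // Background job: split the global limit by each region's traffic share
  reconcile(): void {
    const total = [...this.counts.values()].reduce((a, b) => a + b, 0);
    for (const [region, used] of this.counts) {
      const share = total > 0 ? used / total : 0;
      this.quotas.set(region, Math.max(1, Math.round(this.globalLimit * share)));
    }
    this.counts.clear(); // start the next interval
  }
}
```

Before the first reconciliation every region can serve the full limit (the over-limit window); afterwards, quotas track where the traffic actually is.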
Production Choice: Token Bucket with Redis Cluster
Nginx API Gateway config
limit_req_zone $user_id zone=api:10m rate=100r/s;
limit_req zone=api burst=50 nodelay;
# $user_id is not a built-in nginx variable; it must be set upstream (e.g. via a map block or auth module)
burst=50 admits up to 50 requests beyond the 100r/s rate (≈150 in a burst); nodelay serves them immediately instead of queuing