Design a Distributed Rate Limiter

Problem Statement

Design a rate limiter for a large-scale backend system.

The rate limiter should restrict the number of requests a client can make within a specified time window. The system must operate correctly in a distributed environment where multiple stateless application servers handle traffic.

Example limits:

- 100 requests per minute per user
- 1000 requests per hour per API key
- Different limits per endpoint
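
The example limits above can be written down as plain configuration data. The following is a minimal sketch, not a prescribed schema: the `RateLimit` dataclass, the `limits_for` helper, and the `/search` endpoint with its numbers are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RateLimit:
    """One limit: at most `max_requests` per `window_seconds` for a given scope."""
    scope: str           # what the counter is keyed on, e.g. "user" or "api_key"
    max_requests: int
    window_seconds: int


# These two entries mirror the example limits listed above.
DEFAULT_LIMITS: List[RateLimit] = [
    RateLimit(scope="user", max_requests=100, window_seconds=60),
    RateLimit(scope="api_key", max_requests=1000, window_seconds=3600),
]

# Per-endpoint overrides. The path and numbers are made up for illustration.
ENDPOINT_LIMITS: Dict[str, List[RateLimit]] = {
    "/search": [RateLimit(scope="user", max_requests=10, window_seconds=1)],
}


def limits_for(endpoint: str) -> List[RateLimit]:
    """Return the endpoint-specific limits, falling back to the defaults."""
    return ENDPOINT_LIMITS.get(endpoint, DEFAULT_LIMITS)
```

Keeping limits as data rather than code is one way to make them configurable per endpoint without redeploying the application servers.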

Functional Requirements

- Limit requests based on:
  - User ID
  - API Key
  - IP Address
- Support configurable rate limits (e.g., 10 requests/second, 100 requests/minute).
- Support per-endpoint rate limiting.
- Return an appropriate response (e.g., HTTP 429) when the limit is exceeded.
- Allow dynamic updates to rate limit configurations.
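
The functional requirements above can be illustrated with a small fixed-window counter. This is a sketch under stated assumptions, not the intended design: `FixedWindowLimiter`, `set_limit`, and `check` are hypothetical names, fixed-window counting is just one possible algorithm, and the counters live in a single process's memory. In the distributed setting the problem describes, the counters would need to live in a store shared by all application servers.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Single-process illustration only; counters are not shared across servers."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # endpoint -> (max_requests, window_seconds)
        self._limits: Dict[str, Tuple[int, int]] = {}
        # (client_key, endpoint, window_index) -> requests seen in that window
        # Note: old windows are never evicted in this sketch.
        self._counters: Dict[Tuple[str, str, int], int] = {}

    def set_limit(self, endpoint: str, max_requests: int, window_seconds: int) -> None:
        """Add or replace a limit at runtime (dynamic configuration update)."""
        self._limits[endpoint] = (max_requests, window_seconds)

    def check(self, client_key: str, endpoint: str) -> int:
        """Return an HTTP status: 200 if allowed, 429 if the limit is exceeded.

        `client_key` identifies the caller, e.g. "user:42", "key:abc", "ip:10.0.0.1".
        """
        limit = self._limits.get(endpoint)
        if limit is None:
            return 200  # no limit configured for this endpoint
        max_requests, window_seconds = limit
        window_index = int(self._clock() // window_seconds)
        bucket = (client_key, endpoint, window_index)
        count = self._counters.get(bucket, 0)
        if count >= max_requests:
            return 429
        self._counters[bucket] = count + 1
        return 200
```

A caller would build the `client_key` from whichever identifier the limit applies to (user ID, API key, or IP address) and translate a 429 result into the HTTP response. Fixed windows allow short bursts at window boundaries; that trade-off is accepted here only to keep the sketch short.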

Non-Functional Requirements

- High availability.
- Low latency: the rate check should not significantly impact request processing time.
- Horizontally scalable.
- Fault tolerant.
- Minimal operational overhead.

Scale Assumptions