Problem Statement
Design a rate limiter for a large-scale backend system.
The rate limiter should restrict the number of requests a client can make within a specified time window. The system must operate correctly in a distributed environment where multiple stateless application servers handle traffic.
Example limits:
- 100 requests per minute per user
- 1000 requests per hour per API key
- Different limits per endpoint
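One way to picture these example limits is as a small rule table. The sketch below is illustrative only: the `RateLimit` type, the `LIMITS` table, the `lookup_limit` helper, the `"*"` wildcard, and the `/search` override are all assumed names and values, not part of the problem statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimit:
    max_requests: int    # requests allowed per window
    window_seconds: int  # window length in seconds


# Hypothetical rule table keyed by (scope, endpoint); "*" means "any endpoint".
LIMITS = {
    ("user", "*"): RateLimit(100, 60),        # 100 requests per minute per user
    ("api_key", "*"): RateLimit(1000, 3600),  # 1000 requests per hour per API key
    ("user", "/search"): RateLimit(10, 1),    # assumed per-endpoint override
}


def lookup_limit(scope: str, endpoint: str) -> RateLimit:
    """Prefer an endpoint-specific rule, else fall back to the scope-wide rule."""
    return LIMITS.get((scope, endpoint), LIMITS[(scope, "*")])
```

Under these assumptions, `lookup_limit("user", "/search")` resolves to the endpoint-specific rule, while any other endpoint falls back to the per-user default.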
Functional Requirements
- Limit requests based on:
  - User ID
  - API Key
  - IP Address
- Support configurable rate limits (e.g., 10 requests/second, 100 requests/minute).
- Support per-endpoint rate limiting.
- Return an appropriate response (e.g., HTTP 429) when the limit is exceeded.
- Allow dynamic updates to rate limit configurations.
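To make the core requirement concrete, here is a minimal single-process sketch of a fixed-window counter that maps a rejected request to HTTP 429. It is not a proposed design: the fixed-window algorithm is only one possible choice, the `FixedWindowLimiter` and `handle` names are hypothetical, and the in-memory dictionary would need to be replaced by a shared store to satisfy the distributed requirement.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Counts requests per client key within fixed, aligned time windows."""

    def __init__(self, max_requests: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # In-memory only; old windows are never evicted in this sketch.
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client_key: str) -> bool:
        # Identify the current window by its index since the epoch.
        window = int(self.clock() // self.window_seconds)
        bucket = (client_key, window)
        if self.counts[bucket] >= self.max_requests:
            return False
        self.counts[bucket] += 1
        return True


def handle(limiter: FixedWindowLimiter, client_key: str) -> int:
    """Return the HTTP status a server might send for this request."""
    return 200 if limiter.allow(client_key) else 429
```

The client key can be any of the identifiers listed above (user ID, API key, or IP address), optionally combined with the endpoint name to get per-endpoint limits.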
Non-Functional Requirements
- High availability.
- Low latency: the rate check should not significantly impact request processing time.
- Horizontally scalable.
- Fault tolerant.
- Minimal operational overhead.
Scale Assumptions