How rate limiting works in birchline/api
Every API request passes through the rateLimit() middleware, which resolves the
caller to a bucket key, fetches a token-bucket from Redis, and either consumes one token or
returns 429. Limits are declared per-route in config/limits.yaml; routes
without an entry inherit the default tier (100 req/min per API key).
The request path, step by step
Each step below notes what runs and where it lives. The whole path is ~40 lines and adds about 0.4 ms p50 to every request.
1 · Identify the caller middleware/ratelimit.ts:21
The middleware first reduces the request to a bucketKey: API key if an
Authorization header is present, otherwise the client IP (via the
x-forwarded-for chain, trusting only our own LB). Anonymous IP traffic gets a much
lower default tier.
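The resolution order above can be sketched in a few lines. This is a simplified model, not the real middleware; the types and the fromTrustedLb flag are hypothetical stand-ins for whatever middleware/ratelimit.ts actually uses.

```typescript
// Sketch of bucket-key resolution (hypothetical shapes; see middleware/ratelimit.ts:21).
interface Req {
  headers: Record<string, string | undefined>;
  socketIp: string;        // IP of the direct peer (our LB in production)
  fromTrustedLb: boolean;  // whether the direct peer is one of our own LBs
}

function bucketKey(req: Req): { key: string; tier: "api_key" | "ip" } {
  const auth = req.headers["authorization"];
  if (auth?.startsWith("Bearer ")) {
    // Authenticated caller: bucket by API key.
    return { key: auth.slice("Bearer ".length), tier: "api_key" };
  }
  // Only honour x-forwarded-for when the request came through our LB;
  // the first entry in the chain is the original client.
  const xff = req.headers["x-forwarded-for"];
  const ip = req.fromTrustedLb && xff ? xff.split(",")[0].trim() : req.socketIp;
  return { key: ip, tier: "ip" }; // anonymous IP traffic gets the lower tier
}
```

The key design point is that the x-forwarded-for chain is only trusted when the direct peer is our own load balancer; otherwise a caller could spoof the header to rotate buckets.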
2 · Look up the bucket lib/tokenBucket.ts:9
The route name plus bucket key map to a Redis hash (rl:{route}:{key}) holding
tokens and updatedAt. If the key is missing it's created lazily at full
capacity — there's no warm-up.
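The key layout stated above (rl:{route}:{key}) is trivial to construct; a helper like the following is assumed rather than quoted from lib/tokenBucket.ts:

```typescript
// Redis key layout for a bucket: rl:{route}:{key}
// (Hypothetical helper; the format comes from the doc, the function name does not.)
function redisKey(route: string, bucketKey: string): string {
  return `rl:${route}:${bucketKey}`;
}
```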
3 · Refill and consume lib/tokenBucket.ts:31
Refill is computed from elapsed time (rate × Δt, capped at burst), then
one token is subtracted. The whole read-modify-write runs as a single Lua script so concurrent
requests can't double-spend.
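The arithmetic the Lua script performs can be modelled locally. This is a sketch of the math only, assuming the shapes described above; the real version runs as one script inside Redis precisely so the read-modify-write is atomic.

```typescript
// Model of the refill-and-consume step (sketch; real code: lib/tokenBucket.ts:31).
interface Bucket { tokens: number; updatedAt: number } // updatedAt in ms

function refillAndConsume(
  b: Bucket,
  now: number,        // ms
  ratePerSec: number, // e.g. 20/min => 20 / 60
  burst: number,      // bucket capacity
): Bucket {
  const elapsedSec = (now - b.updatedAt) / 1000;
  // rate × Δt, capped at burst capacity
  const refilled = Math.min(burst, b.tokens + ratePerSec * elapsedSec);
  // consume one token; a negative result signals rejection to the caller
  return { tokens: refilled - 1, updatedAt: now };
}
```

Note that the result is allowed to go negative: the caller interprets tokens < 0 as "reject", rather than the bucket clamping at zero.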
4 · Reject when empty middleware/ratelimit.ts:48
If the script returns tokens < 0 the middleware short-circuits with
429 Too Many Requests and sets Retry-After to the seconds until one token
refills. Successful responses always carry X-RateLimit-Remaining.
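The Retry-After value described above — seconds until one token refills — reduces to a one-liner. The helper name and exact rounding are assumptions; middleware/ratelimit.ts:48 is the source of truth.

```typescript
// Sketch: seconds until one token is available again, used for Retry-After.
// tokens is <= 0 on the rejection path; we need (1 - tokens) tokens at ratePerSec.
function retryAfterSeconds(tokens: number, ratePerSec: number): number {
  return Math.ceil((1 - tokens) / ratePerSec);
}
```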
Configuring a limit on your route
You don't touch the middleware. Add an entry to config/limits.yaml keyed by route name,
and (optionally) tag the route so the middleware can find it.
```yaml
# config/limits.yaml
default:
  rate: 100/min
  burst: 120
search.query:
  rate: 20/min
  burst: 40
  key: api_key  # or: ip
```
```typescript
// routes/search.ts
router.post(
  "/search",
  rateLimit("search.query"),
  handler,
);
```
A limited caller sees:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 17
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0

{ "error": "rate_limited", "retry_after": 17 }
```
You can also call rateLimit() with no argument; the route name is then inferred from the path.

Gotchas worth knowing
- Limits are per-process in dev. The Redis client falls back to an in-memory map when REDIS_URL is unset, so local testing won't reflect real cluster behaviour.
- Burst ≠ rate. burst is the bucket capacity; a caller idle for a minute can fire burst requests instantly even if rate is low.
- Streaming responses count once. The token is consumed at request start; a 30-second SSE stream still costs one token.
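The burst-vs-rate gotcha is easy to demonstrate with a tiny local model (a self-contained sketch; real buckets live in Redis and this helper is invented for illustration):

```typescript
// How many requests can an idle caller fire instantly? Up to burst, regardless
// of rate: refill is rate × idle time, capped at bucket capacity.
function drainAfterIdle(burst: number, ratePerSec: number, idleSec: number): number {
  let tokens = Math.min(burst, ratePerSec * idleSec); // refill from empty
  let served = 0;
  while (tokens >= 1) { tokens -= 1; served += 1; }   // back-to-back requests
  return served;
}
```

With burst 40 and a rate of 0.5 req/s, a caller idle for 80+ seconds gets all 40 requests through at once, even though the steady-state rate is far lower.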
FAQ
- How do I exempt internal traffic?
  Set x-birchline-internal: 1 from the caller; the middleware checks it against the mTLS peer name and skips the bucket entirely.
- Where do I see who's getting limited?
  Every 429 emits a ratelimit.rejected metric tagged with route and key type. There's a Grafana panel under API → Health.
- Can a single user have a higher limit?
  Yes: add their API key under overrides: in the YAML. Overrides are reloaded without a deploy.
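An overrides entry might look like the following. The shape is assumed from the default and per-route entries above, and the API key is a placeholder; check config/limits.yaml for the exact schema.

```yaml
# config/limits.yaml (hypothetical overrides entry; schema assumed)
overrides:
  ak_live_example:   # placeholder API key
    rate: 200/min
    burst: 400
```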