How rate limiting works in birchline/api
Every API request passes through the rateLimit() middleware, which resolves the
caller to a bucket key, fetches a token-bucket from Redis, and either consumes one token or
returns 429. Limits are declared per-route in config/limits.yaml; routes
without an entry inherit the default tier (100 req/min per API key).
The request path, step by step
Each step below notes what runs and where it lives. The whole path is ~40 lines and adds about 0.4 ms p50 to every request.
1 · Identify the caller middleware/ratelimit.ts:21
The middleware first reduces the request to a bucketKey: API key if an
Authorization header is present, otherwise the client IP (via the
x-forwarded-for chain, trusting only our own LB). Anonymous IP traffic gets a much
lower default tier.
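The resolution order above can be sketched in a few lines. This is a simplified model, not the real middleware; the types and the fromTrustedLb flag are hypothetical stand-ins for whatever middleware/ratelimit.ts actually uses.

```typescript
// Sketch of bucket-key resolution (hypothetical shapes; see middleware/ratelimit.ts:21).
interface Req {
  headers: Record<string, string | undefined>;
  socketIp: string;        // IP of the direct peer (our LB in production)
  fromTrustedLb: boolean;  // whether the direct peer is one of our own LBs
}

function bucketKey(req: Req): { key: string; tier: "api_key" | "ip" } {
  const auth = req.headers["authorization"];
  if (auth?.startsWith("Bearer ")) {
    // Authenticated caller: bucket by API key.
    return { key: auth.slice("Bearer ".length), tier: "api_key" };
  }
  // Only honour x-forwarded-for when the request came through our LB;
  // the first entry in the chain is the original client.
  const xff = req.headers["x-forwarded-for"];
  const ip = req.fromTrustedLb && xff ? xff.split(",")[0].trim() : req.socketIp;
  return { key: ip, tier: "ip" }; // anonymous IP traffic gets the lower tier
}
```

The key design point is that the x-forwarded-for chain is only trusted when the direct peer is our own load balancer; otherwise a caller could spoof the header to rotate buckets.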
2 · Look up the bucket lib/tokenBucket.ts:9
The route name plus bucket key map to a Redis hash (rl:{route}:{key}) holding
tokens and updatedAt. If the key is missing it's created lazily at full
capacity — there's no warm-up.
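The key layout stated above (rl:{route}:{key}) is trivial to construct; a helper like the following is assumed rather than quoted from lib/tokenBucket.ts:

```typescript
// Redis key layout for a bucket: rl:{route}:{key}
// (Hypothetical helper; the format comes from the doc, the function name does not.)
function redisKey(route: string, bucketKey: string): string {
  return `rl:${route}:${bucketKey}`;
}
```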
3 · Refill and consume lib/tokenBucket.ts:31
Refill is computed from elapsed time (rate × Δt, capped at burst), then
one token is subtracted. The whole read-modify-write runs as a single Lua script so concurrent
requests can't double-spend.
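The arithmetic the Lua script performs can be modelled locally. This is a sketch of the math only, assuming the shapes described above; the real version runs as one script inside Redis precisely so the read-modify-write is atomic.

```typescript
// Model of the refill-and-consume step (sketch; real code: lib/tokenBucket.ts:31).
interface Bucket { tokens: number; updatedAt: number } // updatedAt in ms

function refillAndConsume(
  b: Bucket,
  now: number,        // ms
  ratePerSec: number, // e.g. 20/min => 20 / 60
  burst: number,      // bucket capacity
): Bucket {
  const elapsedSec = (now - b.updatedAt) / 1000;
  // rate × Δt, capped at burst capacity
  const refilled = Math.min(burst, b.tokens + ratePerSec * elapsedSec);
  // consume one token; a negative result signals rejection to the caller
  return { tokens: refilled - 1, updatedAt: now };
}
```

Note that the result is allowed to go negative: the caller interprets tokens < 0 as "reject", rather than the bucket clamping at zero.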
4 · Reject when empty middleware/ratelimit.ts:48
If the script returns tokens < 0 the middleware short-circuits with
429 Too Many Requests and sets Retry-After to the seconds until one token
refills. Successful responses always carry X-RateLimit-Remaining.
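The Retry-After value described above — seconds until one token refills — reduces to a one-liner. The helper name and exact rounding are assumptions; middleware/ratelimit.ts:48 is the source of truth.

```typescript
// Sketch: seconds until one token is available again, used for Retry-After.
// tokens is <= 0 on the rejection path; we need (1 - tokens) tokens at ratePerSec.
function retryAfterSeconds(tokens: number, ratePerSec: number): number {
  return Math.ceil((1 - tokens) / ratePerSec);
}
```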
Configuring a limit on your route
You don't touch the middleware. Add an entry to config/limits.yaml keyed by route name,
and (optionally) tag the route so the middleware can find it.
```yaml
# config/limits.yaml
default:
  rate: 100/min
  burst: 120
search.query:
  rate: 20/min
  burst: 40
  key: api_key  # or: ip
```
```typescript
// routes/search.ts
router.post(
  "/search",
  rateLimit("search.query"),
  handler,
);
```
A limited caller sees:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 17
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0

{ "error": "rate_limited", "retry_after": 17 }
```
You can also call rateLimit() with no argument; the route name is then inferred from the path.

Gotchas worth knowing
- Limits are per-process in dev. The Redis client falls back to an in-memory map when REDIS_URL is unset, so local testing won't reflect real cluster behaviour.
- Burst ≠ rate. burst is the bucket capacity; a caller idle for a minute can fire burst requests instantly even if rate is low.
- Streaming responses count once. The token is consumed at request start; a 30-second SSE stream still costs one token.
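The burst-vs-rate gotcha is easy to demonstrate with a tiny local model (a self-contained sketch; real buckets live in Redis and this helper is invented for illustration):

```typescript
// How many requests can an idle caller fire instantly? Up to burst, regardless
// of rate: refill is rate × idle time, capped at bucket capacity.
function drainAfterIdle(burst: number, ratePerSec: number, idleSec: number): number {
  let tokens = Math.min(burst, ratePerSec * idleSec); // refill from empty
  let served = 0;
  while (tokens >= 1) { tokens -= 1; served += 1; }   // back-to-back requests
  return served;
}
```

With burst 40 and a rate of 0.5 req/s, a caller idle for 80+ seconds gets all 40 requests through at once, even though the steady-state rate is far lower.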
FAQ
- How do I exempt internal traffic?
  Set x-birchline-internal: 1 from the caller; the middleware checks it against the mTLS peer name and skips the bucket entirely.
- Where do I see who's getting limited?
  Every 429 emits a ratelimit.rejected metric tagged with route and key type. There's a Grafana panel under API → Health.
- Can a single user have a higher limit?
  Yes: add their API key under overrides: in the YAML. Overrides are reloaded without a deploy.
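An overrides entry might look like the following. The shape is assumed from the default and per-route entries above, and the API key is a placeholder; check config/limits.yaml for the exact schema.

```yaml
# config/limits.yaml (hypothetical overrides entry; schema assumed)
overrides:
  ak_live_example:   # placeholder API key
    rate: 200/min
    burst: 400
```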