throttlekit
About

Held to a proof, not a hope.

Most rate limiters do the easy 10% — count requests on one box — and leave the hard parts to luck: what your fleet actually admits, and what your LLM actually spends. ThrottleKit treats those as things to bound and verify, then ships the bounds as ordinary features.

Why it exists
The gap nobody fills.

A single-process counter is a solved problem — a dozen good libraries do it. But the moment you have more than one box, "the limit" becomes approximate: every node admits its own share and the real total drifts above what you set. And the moment your cost is an LLM completion, request-counting misses the point entirely — the thing you actually pay for is tokens, known only after the stream ends.
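To make the drift concrete, here is a minimal sketch of the simplest version of the problem: every node enforces the configured limit with its own local counter. This is not ThrottleKit code; `LocalCounter` and `fleet_admitted` are hypothetical names used only for illustration.

```python
class LocalCounter:
    """Naive per-window counter that only knows about its own node."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def try_admit(self) -> bool:
        # Admit while this node's own count is under the limit.
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def fleet_admitted(nodes: int, limit: int, requests_per_node: int) -> int:
    """Total admissions in one window when every node counts independently."""
    counters = [LocalCounter(limit) for _ in range(nodes)]
    admitted = 0
    for counter in counters:
        for _ in range(requests_per_node):
            if counter.try_admit():
                admitted += 1
    return admitted
```

With a limit of 100 and four nodes that each see 150 requests, `fleet_admitted(4, 100, 150)` admits 400 requests in the window, four times what was configured. On a single node the same counter holds the limit exactly.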

Those are the two places ThrottleKit is built for. It is not a faster counter. It is a limiter that can tell you, with a machine-checked bound, that admissions never exceed the limit no matter how many nodes you run, and one that can govern token spend with an overshoot that does not grow with the cap. The rest of a complete limiter is here too, but those two are the reason it exists.
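One common way to budget a cost that is only known after the fact is reserve-then-settle: hold an estimate when the request is admitted, then replace it with the real token count once the stream ends. The sketch below illustrates that general idea under stated assumptions. `TokenBudget`, `reserve`, and `settle` are hypothetical names, and nothing here is taken from ThrottleKit's API or from its verified bound.

```python
class TokenBudget:
    """Hypothetical reserve-then-settle token budget for a single window."""

    def __init__(self, cap: int) -> None:
        self.cap = cap
        self.spent = 0      # tokens from streams that have finished
        self.reserved = 0   # estimates held for streams still in flight

    def reserve(self, estimate: int) -> bool:
        """Admit a request only if its estimate still fits under the cap."""
        if self.spent + self.reserved + estimate > self.cap:
            return False
        self.reserved += estimate
        return True

    def settle(self, estimate: int, actual: int) -> None:
        """Swap the held estimate for the real count once the stream ends."""
        self.reserved -= estimate
        self.spent += actual
```

In this sketch, any overshoot equals the sum of the underestimates across in-flight streams. With a cap of 1000, two reservations of 500 that each settle at 600 leave `spent` at 1200. The excess of 200 comes from estimation error and concurrency, and would be the same if the cap were ten times larger.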

How it's built
Four commitments.
  • Machine-checking is a verification technique, not a formality. The load-bearing bounds live in TLA⁺ specs run through TLC, with a dependency-free twin that re-derives the same invariants in CI on every push. It is treated like exhaustive testing, because that's what it is.
  • One verified core, everywhere. Strategies are pure functions of time, compiled to an atomic Lua form proven bit-identical across six stores. Add a backend by implementing one primitive; adding an algorithm never touches a store. The Python client computes no math — it reaches the same oracle.
  • Safety is decoupled from cleverness. The online learners (lease sizing, token reservation) only trade efficiency; the hard cap is held structurally, so no predictor — however adversarial — can breach it.
  • Honest edges, in writing. Every component doc ends with its caveats and failure behavior. The benchmarks lead with methodology and say plainly where an incumbent wins. Numbers trace to a harness you can re-run.
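
The third commitment, holding the cap structurally, can be illustrated with a short sketch. It assumes a central budget that hands out leases and clamps every request to what remains, so the predictor's suggestion affects only how the budget is divided. `LeaseBudget`, `grant`, and `run` are hypothetical names, not ThrottleKit's implementation.

```python
from typing import Callable


class LeaseBudget:
    """Hypothetical central budget that hands out leases for one window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.granted = 0

    def grant(self, requested: int) -> int:
        """Clamp any request to [0, remaining]; the predictor is never trusted."""
        remaining = self.limit - self.granted
        size = max(0, min(requested, remaining))
        self.granted += size
        return size


def run(limit: int, predictor: Callable[[int], int], rounds: int) -> int:
    """Drive the budget with an arbitrary predictor and return the total granted."""
    budget = LeaseBudget(limit)
    for i in range(rounds):
        budget.grant(predictor(i))
    return budget.granted
```

A predictor that always asks for a billion permits hurts efficiency, because the first caller takes the whole window, but `run(100, lambda i: 10**9, 50)` still returns exactly 100. The clamp enforces the cap, and the predictor only chooses how to spend it.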

Status
A stable 1.x core, an open frontier.

The core — strategies, stores, adapters, the two-tier engine — is shipped and stable under SemVer. Pieces on the evolving frontier are marked experimental and excluded from the surface guarantee, so you always know what you're depending on. ThrottleKit is polyglot from one verified core: Node today, Python today via a thin client that reaches the same oracle, more to follow.

Who & license
MIT. Developed in the open.

ThrottleKit is built by Ameya Borkar and released under the MIT license — free to use, inspect, and build on. The design is documented component by component, the proofs are in the repo, and the benchmarks run on your hardware. If you find an edge the docs don't cover, the issues are open.