# llm-cost

Offline token counting, pricing, and quota guarding for LLM workloads (OpenAI, Llama, etc.). A single binary with exact tokenizers for o200k_base and cl100k_base.
llm-cost is a statically linked CLI tool written in Zig. It replicates OpenAI's tiktoken logic with memory safety and full offline capability, and is designed for integration into CI/CD pipelines and infrastructure scripts.
Output is validated against tiktoken using edge-case corpora (Unicode, whitespace).

## Installation

### Binaries

Stable releases are available on GitHub Releases.

### From Source

Requires Zig 0.14.0.
```bash
git clone https://github.com/Rul1an/llm-cost
cd llm-cost
zig build -Doptimize=ReleaseFast
cp zig-out/bin/llm-cost /usr/local/bin/
```
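To confirm the binary is on your PATH, run any command from the Usage section below, for example:

```bash
# Smoke test: should print a token count for the sample string.
echo "Hello world" | llm-cost count --model gpt-4o
```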
## Usage

### Count Tokens
```bash
# Direct input
llm-cost count --model gpt-4o --text "Hello world"

# Pipe from file
cat document.txt | llm-cost count --model gpt-4o
```
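The stdin mode composes with standard shell tooling for batch work. A minimal sketch, counting each file independently (the `docs/*.md` glob is a placeholder):

```bash
# Count tokens per file across a directory of Markdown documents.
for f in docs/*.md; do
  printf '%s: ' "$f"
  llm-cost count --model gpt-4o < "$f"
done
```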
### Estimate Cost
```bash
llm-cost estimate --model gpt-4o --input-tokens 5000 --output-tokens 200
```
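The two commands can be chained. A sketch, assuming `count` prints a bare integer on stdout (check the output format of your version) and using an assumed output budget of 200 tokens:

```bash
# Count the real input tokens, then estimate cost with a fixed output budget.
input_tokens=$(llm-cost count --model gpt-4o < prompt.txt)
llm-cost estimate --model gpt-4o --input-tokens "$input_tokens" --output-tokens 200
```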
### Analyze Corpus (Compression & Costs)
```bash
llm-cost report --model gpt-4o --json my_corpus.txt
# Output: {"stats":{...}, "metrics":{"bytes_per_token":4.2, "tokens_per_word":1.3}}
```
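Because the report is JSON, individual metrics can be extracted with `jq`, using the `metrics` keys shown above:

```bash
# Pull the compression ratio (bytes per token) out of the JSON report.
llm-cost report --model gpt-4o --json my_corpus.txt | jq '.metrics.bytes_per_token'
```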
### Pipeline Integration
```bash
# Fail if cost exceeds $1.00
cat logs.jsonl | llm-cost pipe --model gpt-4o --max-cost 1.00
```
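Because pipe mode fails when the budget is exceeded, it can gate a CI job directly. A minimal sketch, assuming a non-zero exit code on budget overrun (`prompts.jsonl` is a placeholder):

```bash
# Abort the build when the estimated prompt cost exceeds the budget.
if ! llm-cost pipe --model gpt-4o --max-cost 1.00 < prompts.jsonl; then
  echo "Prompt corpus exceeds the \$1.00 budget; aborting." >&2
  exit 1
fi
```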
## Documentation

Project documentation follows the Diátaxis structure.
| Type | Content |
|---|---|
| Guides | CI Integration, Release Verification |
| Reference | CLI Commands, Benchmarks, Man Page |
| Explanation | Architecture, Security Policy |
## Performance

| Metric | Result (Apple Silicon) |
|---|---|
| Throughput | ~10.11 MB/s |
| Latency (P99) | ~0.13 ms (small inputs) |
| Complexity | O(n), linear |
See docs/reference/benchmarks.md for methodology.
## Security

Builds adhere to SLSA Level 2 standards.
See docs/guides/verification.md for verification steps.
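As one possible flow, if releases publish GitHub build provenance, a downloaded binary can be checked with the `gh` CLI's attestation support (illustrative only; docs/guides/verification.md is authoritative):

```bash
# Verify the downloaded release binary against its published provenance.
gh attestation verify llm-cost --repo Rul1an/llm-cost
```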
## License

MIT © Rul1an