TokenRedux A small web app that compresses LLM prompts to reduce input-token usage, and shows you the before/after token count and estimated cost difference.
Status: early MVP / learning project. This is not a finished product. It compresses pasted prompts via an API call and returns a shortened version plus a short explanation. Treat the savings numbers as estimates, not guarantees.
What it does
Paste a prompt Sends it to an LLM that rewrites it more concisely while trying to preserve intent Returns the compressed prompt, a before/after token count, and an estimated cost difference Lets you copy the compressed version
What it does not do (yet)
No context/history pruning No model selection or automatic model switching No agent-output compression No guarantee that output quality is preserved — verify before relying on a compressed prompt Token counts are estimates and may not exactly match a given provider's tokenizer
Why LLM API costs scale with tokens. A large share of input tokens in real workflows is redundant (verbose phrasing, repeated instructions). This tool is an experiment in reducing that on the input side, without manually rewriting prompts. How it works
User input is wrapped in delimiters so the model compresses it rather than acting on it The model is prompted to shorten the prompt while preserving meaning, and to keep code/error messages verbatim Output is returned as structured data (compressed prompt + explanation) Token counts are computed and compared
Setup bashgit clone cd tokenredux
Replace the above with the real commands once the stack is finalized.
Limitations & known issues
Compression quality depends entirely on the underlying model and prompt No formal benchmarks yet on how often intent is preserved Verbatim preservation of code/errors is best-effort, not guaranteed Cost estimates assume a specific model's pricing; check the source
MIT