read tool inlines PDFs/images as base64 into history → context overflow (HTTP 400) on text-only models #181

## Summary

The `read` tool inlines whole binary files (PDFs, images) as base64 **data-URIs** directly into the conversation history. On a **text-only** model such as `deepseek-reasoner`, that payload is unusable and quickly overflows the context: each PDF read adds ~1.6 MB of base64 (≈400K tokens) to the message history, and the full history is resent on every turn and on session resume. The provider eventually rejects the request with a hard HTTP 400:

```
400 This model's maximum context length is 1048565 tokens. However, you requested 5677038 tokens
(5677038 in the messages, 0 in the completion). Please reduce the length of the messages or completion.
```

The 400 fires in `SessionManager.createSession → activateSession → createChatCompletionStream`, because session (re)activation loads the entire stored history into `messages`.

## Environment

- `@vegamo/deepcode-cli` 0.1.30
- Model: `deepseek-reasoner` (`https://api.deepseek.com`), `thinkingEnabled: true`
- Node v23.11

## Steps to reproduce

1. Start a session and `read` a PDF (e.g. an invoice/report).
2. Observe the tool output stored in history is `data:application/pdf;base64,JVBERi0...` (the whole file).
3. Read a few more PDFs (or resume the session). Each adds ~400K tokens.
4. The next request is rejected with the 400 above once the cumulative history exceeds the model's context window.

Real example: a session with only **12 records** in total, **4** of which were `read`-tool base64 PDF outputs of ~1.6 MB each (~1.6M tokens combined, already above the 1,048,565-token limit); larger sessions reached ~5.7M tokens, as in the error above.
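The arithmetic behind these figures can be sketched as follows. This is a rough estimate only: the ~4 characters per token ratio is an assumption inferred from the numbers in this report (1.6 MB ≈ 400K tokens), not a real tokenizer, and all function names are hypothetical.

```typescript
// Assumed ratio, implied by the report's own figures (1.6 MB ≈ 400K tokens).
const CHARS_PER_TOKEN = 4;

// Base64 encodes every 3 input bytes as 4 output characters (with padding).
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Crude token estimate for a string of the given length.
function estimateTokens(chars: number): number {
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Cumulative history cost after `count` reads of `base64Chars` characters each.
function historyTokens(base64Chars: number, count: number): number {
  return count * estimateTokens(base64Chars);
}

const contextLimit = 1048565;                // from the 400 message above
const perPdf = estimateTokens(1600000);      // one ~1.6 MB base64 output
const fourPdfs = historyTokens(1600000, 4);  // the 12-record session
console.log({ perPdf, fourPdfs, overLimit: fourPdfs > contextLimit });
```

Under this estimate a third PDF read is already enough to push the history past the limit, since 3 × 400K exceeds 1,048,565.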

## Root cause (from the bundled `dist/cli.js`)

- The `read` tool's PDF branch returns `` output: `data:application/pdf;base64,${base64}` `` (metadata `mime: "application/pdf"`).
- Images go through `bufferToDataUrl(buffer, mimeType)` → `data:${mimeType};base64,${...}` for `IMAGE_MIME_BY_EXT` (png/jpg/jpeg/gif/webp).
- There is **no multimodality gating** — base64 is inlined regardless of whether the active model can consume it (`deepseek-reasoner` cannot).
- There is **no per-tool-output size cap** and **no pre-send token-budget check** against the model context limit; the only "truncate" in the bundle is UI rendering (`wrap: "truncate-end"`).

## Suggested fixes

1. **Gate base64 inlining on model multimodality.** For text-only models, never inline binary; for PDFs, extract text (e.g. `pdftotext`/a PDF parser) and pass the text instead.
2. **Cap per-tool-output size** before appending to history (byte/token limit + a "[truncated]" notice), so a single `read` can't add hundreds of thousands of tokens.
3. **Add a pre-send token-budget guard** against the model's max context: estimate request size, keep a margin under the limit, and compact / drop-oldest (or fail soft with a clear message) instead of surfacing a raw provider 400.
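A minimal sketch of how the three fixes above could fit together. Everything here is an assumption for illustration: the `ModelCaps` and `Message` shapes, the function names, the 40,000-character cap, the 10% margin, and the ~4 characters per token estimate are hypothetical and do not reflect the actual `deepcode-cli` internals.

```typescript
// Hypothetical shapes; the real CLI's model and message types may differ.
interface ModelCaps { multimodal: boolean; maxContextTokens: number; }
interface Message { role: string; content: string; }

const MAX_TOOL_OUTPUT_CHARS = 40000; // assumed cap, tune per model
const CHARS_PER_TOKEN = 4;           // crude estimate, not a real tokenizer
const DATA_URI = /^data:([^;,]*)[;,]/;

// Fixes 1 and 2: never inline binary for text-only models, then cap text size.
function prepareToolOutput(output: string, caps: ModelCaps): string {
  const match = DATA_URI.exec(output);
  if (match && !caps.multimodal) {
    return `[binary ${match[1] || "unknown"} omitted: active model is text-only]`;
  }
  if (match) return output; // multimodal model: leave the data-URI intact
  if (output.length <= MAX_TOOL_OUTPUT_CHARS) return output;
  const omitted = output.length - MAX_TOOL_OUTPUT_CHARS;
  return output.slice(0, MAX_TOOL_OUTPUT_CHARS) + `\n[truncated ${omitted} characters]`;
}

function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Fix 3: drop the oldest non-system messages until the estimate fits under
// the limit minus a margin. The newest message is never dropped; if the
// history still does not fit, fail with a clear local error.
function fitToBudget(messages: Message[], caps: ModelCaps, margin = 0.1): Message[] {
  const budget = Math.floor(caps.maxContextTokens * (1 - margin));
  const kept = [...messages];
  while (estimateTokens(kept) > budget) {
    const idx = kept.findIndex((m) => m.role !== "system");
    if (idx === -1 || idx === kept.length - 1) {
      throw new Error(
        `History needs ~${estimateTokens(kept)} tokens but the budget is ${budget}; ` +
        `start a new session or compact the history.`
      );
    }
    kept.splice(idx, 1);
  }
  return kept;
}
```

In this sketch `prepareToolOutput` would run before a tool result is appended to history, and `fitToBudget` just before each request. A real implementation would likely need to drop assistant tool calls together with their matching tool results, and could summarize old messages rather than discard them.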

## Workaround

I locally patched the `read` PDF branch to return capped, `pdftotext`-extracted text instead of base64. Output shrank ~40× (for one real PDF: ~1.6 MB base64 → ~39 KB text), and PDF reading still works. Happy to share the diff if useful.
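For readers who want to try something similar, here is a minimal sketch of the idea. It is not the actual patch: `readPdfAsText`, the `Run` type, and the 40,000-character cap are hypothetical. It assumes the `pdftotext` binary from poppler-utils is on `PATH`; the command runner is passed in so the helper stays easy to test.

```typescript
// Executes a command and returns its stdout. In Node this could be
//   (cmd, args) => execFileSync(cmd, args, { encoding: "utf8" })
// with execFileSync imported from "node:child_process".
type Run = (cmd: string, args: string[]) => string;

const MAX_TEXT_CHARS = 40000; // assumed cap, roughly 10K tokens at ~4 chars/token

// Hypothetical replacement for the PDF branch: extract text, then cap it.
function readPdfAsText(path: string, run: Run, maxChars = MAX_TEXT_CHARS): string {
  // `pdftotext <file> -` writes the extracted text to stdout;
  // `-layout` tries to preserve the page's physical layout.
  const text = run("pdftotext", ["-layout", path, "-"]);
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  return text.slice(0, maxChars) + `\n[truncated: ${omitted} more characters]`;
}
```

Scanned PDFs with no text layer would yield little or no text this way, so a real fix might report that case explicitly instead of returning an empty string.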

---

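### Chinese summary (translated to English)

The `read` tool writes whole PDFs/images into the conversation history as base64 data-URIs. For a text-only model (such as `deepseek-reasoner`), each PDF read adds roughly 400K tokens; after several reads or a session resume, the history exceeds the context limit and the request fails with a 400 (max 1048565, requested 5677038). Root cause: the `read` PDF branch returns `data:application/pdf;base64,...`, images go through `bufferToDataUrl`, and there is no check of the model's multimodal capability, no per-output truncation, and no pre-send token-budget check. Suggestions: decide whether to inline based on model capability; extract text from PDFs instead (e.g. pdftotext); cap tool-output size; and check the token budget before sending, degrading gracefully rather than surfacing a raw 400.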

