fix: driver-aware MCP toolset so MongoDB connections work (#180) by eugene-mobile · Pull Request #181 · appspace/helix

eugene-mobile · 2026-06-19T20:02:07Z

Fixes #180.

Problem

When Helix is connected to MongoDB, the MCP server is unusable for AI agents — no query returns a single row. The tools were hard-coded to be SQL-shaped regardless of the connected driver, and the SQL keyword gate (firstKeyword) and the MQL driver impose mutually-exclusive requirements:

SQL in → passes the keyword gate → driver rejects with "expects a JSON-encoded MQL request"
MQL JSON in → rejected by the keyword gate ("Got: unknown" / "Got: DB.") before reaching the driver

See #180 for the full write-up and an agent transcript hitting the dead-end.

Fix

buildMcpServer() now branches on getDriver().queryMode and registers one of two mutually-exclusive tool sets:

mode	tools
`sql`	`execute_query`, `execute_write`, `list_tables`, `describe_table` (unchanged)
`mql`	reads: `find_documents`, `aggregate_documents`, `count_documents`, `list_collections`, `describe_collection` · writes (gated): `insert_document`, `update_document`, `delete_document`

The MongoDB tools expose MQL fields (filter, projection, sort, pipeline, document, update, …) as typed parameters and assemble the JSON-encoded request the driver already parses — the model never hand-authors MQL JSON. describe_collection also surfaces indexes + the JSON Schema validator via the driver's existing getCollectionInfo().

Agents know SQL vs NoSQL upfront two ways: (1) the wrong tools aren't registered (an agent on Mongo never sees execute_query), and (2) the MCP instructions are mode-aware and name the connected DB type, e.g. "This is a MongoDB (document) database… Do NOT write SQL."

This slots into the existing architecture cleanly: the transport is stateless and buildMcpServer() already runs per request, so tools/list reflects the currently-connected database.

Tests

New server/src/mcp.test.ts (7 tests, via the SDK in-memory transport):

Mongo connection exposes Mongo tools and not execute_query/execute_write
SQL connection exposes SQL tools and not the Mongo tools
mode-aware instructions say MongoDB / don't-write-SQL
find_documents assembles the correct {collection, operation:'find', filter, limit} JSON and passes schema through
insert_document blocked when writes disabled, allowed when enabled
update_document requires a filter or id

All 169 server tests pass; mcp.ts and mcp.test.ts are type-clean and mcp.ts is lint-clean. (3 pre-existing tsc errors in unrelated driver test files are untouched.)

Known follow-ups (out of scope)

_id ObjectId matching in find/count filters (string _id matched literally; relates to existing TODO Create MongoDB DbDriver implementation #107). An id convenience param is exposed on update/delete.
Stateless transport: a client that cached the tool list won't get a tools/list_changed push if the user switches DB type mid-session.

🤖 Generated with Claude Code

The MCP server always registered SQL-shaped tools (execute_query, execute_write, ...) regardless of the connected driver. On a MongoDB connection the SQL keyword gate and the MQL driver imposed mutually exclusive requirements, so no query could return a single row. buildMcpServer() now branches on getDriver().queryMode and registers one of two mutually-exclusive tool sets: - sql -> execute_query / execute_write / list_tables / describe_table - mql -> find_documents / aggregate_documents / count_documents / list_collections / describe_collection (reads) and insert_document / update_document / delete_document (writes, gated by "Allow MCP to modify data") The Mongo tools expose MQL fields as typed params and assemble the JSON-encoded request the driver already parses, so the model never hand-authors MQL JSON. Server instructions are now mode-aware and name the connected DB type, so the agent knows upfront whether to use SQL or MongoDB tools and never attempts the wrong query language. Adds server/src/mcp.test.ts covering toolset selection per mode, mode-aware instructions, MQL request assembly, and write gating. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

eugene-mobile · 2026-06-19T20:03:12Z

@claude review this PR

claude · 2026-06-19T20:03:27Z

Claude finished @eugene-mobile's task in 2m 23s —— View job

PR Review: fix/mcp-mongodb-tools

Gather context and understand request
Read CLAUDE.md for repo guidelines
Read server/src/mcp.ts changes
Read server/src/mcp.test.ts (new file)
Review git diff and driver interface
Post comprehensive review feedback

Summary

This is a well-designed fix for a real, clearly-documented bug. The architecture is clean: buildMcpServer() branches on queryMode, registers mutually-exclusive tool sets, and the mode-aware instructions string is a nice quality-of-life improvement for agents. The approach of assembling MQL JSON in the tool layer (rather than asking the model to hand-author it) is exactly right.

Below are the findings, ordered by severity.

Medium — Aggregate limit is applied client-side (memory footprint)

File: server/src/mcp.ts:350-353

find_documents passes limit: cap into the MQL request, so the MongoDB driver applies it at the cursor level (server-side, via cursor.limit()). aggregate_documents does not — it calls .toArray() on the full cursor and then slices in the MCP layer:

// find_documents — server-side limit ✓
const result = await runMql(
  { collection, operation: 'find', ..., limit: cap, skip },
  schema,
);

// aggregate_documents — all rows fetched, THEN sliced ✗
const result = await runMql({ collection, operation: 'aggregate', pipeline }, schema);
const totalRows = result.rows.length;
const capped = result.rows.slice(0, cap);

The tool description says "For large pipelines, add an explicit {"$limit": N} stage", but that's guidance to the model, not an actual server-side safeguard. A naive aggregate_documents call on a large collection will buffer up to MAX_ROW_LIMIT (10 000) rows in the Node process regardless of the limit param.

Suggested fix: append a $limit stage to the pipeline before dispatching:

const limitedPipeline = [...pipeline, { $limit: cap }];
const result = await runMql({ collection, operation: 'aggregate', pipeline: limitedPipeline }, schema);

This makes the behavior consistent with find_documents and eliminates the memory footprint.

Fix this →

Medium — Empty `filter: {}` bypasses the guard on `update_document` / `delete_document`

File: server/src/mcp.ts:449, 487

The guard checks:

if (filter === undefined && id === undefined) {
  return toolError('update_document requires either "filter" or "id"...');
}

An AI agent passing filter: {} (an empty object, which is a valid MongoDB "match all" filter) will pass this check. updateOne({}, ...) and deleteOne({}) then silently target an arbitrary document. While writes are gated behind isMcpWritesAllowed(), this remains a foot-gun for agents that mis-generate a filter.

Suggested addition (inside the existing guard block):

if (filter !== undefined && Object.keys(filter).length === 0 && id === undefined) {
  return toolError('update_document: an empty filter {} would match any document — provide at least one field to narrow the selection, or use "id".');
}

Minor — Missing test: `delete_document` requires filter-or-id

File: server/src/mcp.test.ts

There is a test for update_document requires a filter or id (line 141) but no corresponding test for delete_document. The guard is the same pattern and an easy addition:

it('delete_document requires a filter or id', async () => {
  vi.mocked(isMcpWritesAllowed).mockReturnValue(true);
  const query = vi.fn();
  mockMongoDriver(query);
  const { client } = await connectClient();

  const res: any = await client.callTool({
    name: 'delete_document',
    arguments: { collection: 'users' },
  });

  expect(res.isError).toBe(true);
  expect(res.content[0].text).toMatch(/requires either "filter" or "id"/);
  expect(query).not.toHaveBeenCalled();
});

Fix this →

Minor — `buildInstructions` joins sentences with a bare space

File: server/src/mcp.ts:536

return base.join(' ');

This concatenates sentences like "...not connected yet. This is a MongoDB..." without a newline or double-space. The resulting string is grammatically fine but could be split on '. ' to be more readable in MCP client UIs that render instructions as markdown. Not blocking.

Nit — `describe_collection` has no test

The tool exercises the optional getCollectionInfo path and falls back gracefully to null. A quick test that verifies the indexes and validator fields appear in the output (or that it falls back when getCollectionInfo is absent) would round out coverage. Not blocking for merge.

What's done well

Driver-aware branching in buildMcpServer() is minimal and correct. Defaulting to 'sql' when disconnected is a safe, backward-compatible choice.
Mode-aware instructions (naming the DB type, the explicit "Do NOT write SQL" guidance) are high signal for agents and don't rely on the agent reading tool names.
docSchema = z.record(z.string(), z.unknown()) is the right type for free-form Mongo documents; it avoids over-constraining filter/projection/sort shapes.
describe_collection surfaces indexes and JSON Schema validators — useful for agents that need to understand document shape before querying.
Test setup (InMemoryTransport + buildMcpServer()) is clean and avoids heavy mocking. The existing 7 tests cover the key paths.
isMcpWritesAllowed gate is consistently applied to all three write tools.
Known follow-ups are documented in the PR description rather than left as silent debt.

eugene-mobile · 2026-06-19T20:11:07Z

Follow-up: handle DB connect/switch mid-session

Addressing review feedback — the original per-mode registration had a real gap, not a minor one: an agent that connects its MCP client before a database is selected (the common case for a troubleshooting session) caches tools/list. The transport is stateless (sessionIdGenerator: undefined), so the server can't push tools/list_changed. After the user then connects MongoDB, the agent still holds the SQL tools, and calling execute_query hits a server that no longer registers it → a cryptic "tool not found" with no hint why.

Fix: invariant tool list + per-call mode guards

The tool list is now the same regardless of connected DB, so there is no stale cache to invalidate:

All tools always registered — both families plus a new connection_info tool. tools/list never changes with DB type.
Per-call mode guard (requireMode) — a tool from the wrong family returns an actionable error ("This MCP connection is a MongoDB database… use find_documents/…; call connection_info / re-list / reconnect") instead of "tool not found". The agent self-corrects.
connection_info — always available; returns the connected DB type, query language, write state, and recommended tools, so the agent orients up front and can detect a mid-session switch.
Mode-aware instructions still name the currently-connected DB and point at connection_info.

This is client-agnostic (doesn't depend on a client honoring tools/list_changed) and self-correcting. mcp.test.ts now asserts the invariant toolset, connection_info output, and that cross-family calls return actionable mismatch errors. All 171 server tests pass.

eugene-mobile closed this Jun 19, 2026

eugene-mobile mentioned this pull request Jun 19, 2026

fix: MongoDB support for the MCP server (driver-aware tools + connection awareness) (#180) #182

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: driver-aware MCP toolset so MongoDB connections work (#180)#181

fix: driver-aware MCP toolset so MongoDB connections work (#180)#181
eugene-mobile wants to merge 1 commit into
mainfrom
fix/mcp-mongodb-tools

eugene-mobile commented Jun 19, 2026

Uh oh!

eugene-mobile commented Jun 19, 2026

Uh oh!

claude Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

eugene-mobile commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

eugene-mobile commented Jun 19, 2026

Problem

Fix

Tests

Known follow-ups (out of scope)

Uh oh!

eugene-mobile commented Jun 19, 2026

Uh oh!

claude Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: fix/mcp-mongodb-tools

Summary

Medium — Aggregate limit is applied client-side (memory footprint)

Medium — Empty filter: {} bypasses the guard on update_document / delete_document

Minor — Missing test: delete_document requires filter-or-id

Minor — buildInstructions joins sentences with a bare space

Nit — describe_collection has no test

What's done well

Uh oh!

eugene-mobile commented Jun 19, 2026

Follow-up: handle DB connect/switch mid-session

Fix: invariant tool list + per-call mode guards

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 19, 2026 •

edited

Loading

Medium — Empty `filter: {}` bypasses the guard on `update_document` / `delete_document`

Minor — Missing test: `delete_document` requires filter-or-id

Minor — `buildInstructions` joins sentences with a bare space

Nit — `describe_collection` has no test