MCP Apps: interactive Vega-Lite chart widget #133
- Vite-built widget bundles ext-apps SDK + Vega-Lite + vega-interpreter into a single HTML file (no CDN deps, no eval, CSP-safe)
- Widget receives tool result via MCP Apps protocol (ontoolresult)
- Supports fullscreen toggle when host advertises it
- create_chart returns vega_spec only (no PNG, widget renders interactively)
- Register ui://sidemantic/chart resource with proper MCP Apps metadata
- Remove _apps_enabled flag (MCP Apps works via protocol, not a runtime flag)
- Remove vendored mcp-ui-server (widget is self-contained)
- Remove --apps CLI flag dependency on mcp-ui-server import
- structured_output=False on all tools (fixes Claude Desktop tool visibility)
- Replace | None params with falsy defaults (removes anyOf from schemas)
- Move expand button to top-right inline with title ("Expand ↗")
- Fix fullscreen height to fill viewport minus 150px for prompt box
- Add padding in fullscreen mode (16px top, 24px sides)
- Add 5px height buffer to prevent inline scroll
- Dark mode support for expand button
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 606d450029
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Reviewed commit: f894c5edc7
- Generation counter ensures out-of-order embed promises discard stale results
- cleanupChart() called before error/loading content to prevent leaked observers
💡 Codex Review
Reviewed commit: 9d4786aca1
💡 Codex Review
Reviewed commit: 4d8fe76d1d
- New explore_metrics tool launches dashboard with metric series, totals, and dimension leaderboards via ui://sidemantic/explorer
- App-only widget_query tool (visibility: ["app"]) handles refresh calls from the widget without LLM round-trips
- WidgetModel adapter bridges anywidget's model interface to ext-apps SDK, translating set/save_changes into callServerTool requests
- Data transported as base64 Arrow IPC for efficient typed transfer
- Multi-entry Vite build for chart (960KB) and explorer (336KB) widgets
- Add pyarrow to apps optional dependency
💡 Codex Review
Reviewed commit: 6e28033d1b
    # --- MCP Resource: Catalog Metadata ---
    # --- Explorer state ---

    _explorer_state: dict | None = None
Scope explorer state per MCP session
The explorer backend keeps all widget context in a single module-global _explorer_state, which is overwritten on every explore_metrics call and then reused by widget_query. In HTTP/app mode, concurrent conversations or users can interleave requests, so one session can read another session’s model, filters, and date range, producing cross-session data leakage and incorrect query results. Store state by session/conversation/tool-call identifier instead of a singleton.
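One way to read this suggestion is sketched below: the single global slot becomes a dictionary keyed by a session identifier. The names `_explorer_states`, `set_explorer_state`, and `get_explorer_state` are hypothetical, and how a session identifier is obtained from the MCP request is an assumption the sketch leaves to the caller.

```python
from __future__ import annotations

import threading
from typing import Any

# Hypothetical replacement for the module-global _explorer_state:
# one state dict per session identifier instead of a single shared slot.
_explorer_states: dict[str, dict[str, Any]] = {}
_explorer_lock = threading.Lock()


def set_explorer_state(session_id: str, state: dict[str, Any]) -> None:
    """Store explorer context for one session (would be called by explore_metrics)."""
    with _explorer_lock:
        _explorer_states[session_id] = state


def get_explorer_state(session_id: str) -> dict[str, Any] | None:
    """Fetch the context for one session (would be called by widget_query)."""
    with _explorer_lock:
        return _explorer_states.get(session_id)
```

With this shape, a widget_query call from one conversation can only see the state stored under its own identifier; eviction of old sessions is not covered here.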
    // Extract data from tool result
    const data = this.extractData(result);
    if (data) {
      this.applyData(data);
Ignore stale widget_query responses in callRefresh
After awaiting callServerTool, the handler unconditionally applies returned data with no request-generation guard. If users trigger multiple refreshes quickly (e.g., changing time grain then filters), a slower earlier response can arrive last and overwrite the model with stale data that no longer matches current selections. Track a refresh token/counter and discard out-of-order responses before applyData.
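The refresh-counter idea can be illustrated independently of the widget code. The widget itself is TypeScript; the sketch below uses Python's asyncio only to show the pattern, and `RefreshGuard`, `refresh`, and `demo` are hypothetical names. The `asyncio.sleep` call stands in for the awaited tool call.

```python
import asyncio


class RefreshGuard:
    """Drops responses from refreshes that were superseded while in flight."""

    def __init__(self) -> None:
        self.generation = 0
        self.applied = []

    async def refresh(self, label: str, delay: float) -> bool:
        self.generation += 1
        token = self.generation        # remember which request this is
        await asyncio.sleep(delay)     # stands in for the awaited tool call
        if token != self.generation:   # a newer refresh started meanwhile
            return False               # discard the stale response
        self.applied.append(label)     # stands in for applyData(data)
        return True


async def demo():
    guard = RefreshGuard()
    # The first request is slower and finishes after the second one.
    await asyncio.gather(guard.refresh("old", 0.05), guard.refresh("new", 0.01))
    return guard.applied
```

Here only the later request is applied, even though the earlier one resolves last.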
- explore_metrics returns config only (instant), no data queries
- Widget renders skeletons immediately, then fetches data via widget_query
- widget_query restructured to query one metric or one dimension per call, preventing timeout from sequential batch of 33+ queries
- Fix ORDER BY using raw timestamp instead of grain-suffixed column
- Improve time_series_column detection: grain suffix > Arrow type > name match > first col
- Cap auto-discovered metrics and dimensions at 8 each
- Skip full-table date range scan when start_date/end_date provided
- Remove outer try/except in widget_query so partial results still return
- Expand button as top bar instead of absolute overlay
- Add stderr logging for debugging MCP tool calls
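The detection priority listed above (grain suffix, then Arrow type, then name match, then first column) could look roughly like the following. This is a sketch under assumptions, not the PR's code: the suffix spellings in `GRAIN_SUFFIXES`, the name hints, and the function name `detect_time_series_column` are all hypothetical, and columns are passed as plain `(name, arrow_type_string)` pairs so the example does not depend on pyarrow.

```python
# Assumed naming convention for grain-suffixed columns (hypothetical).
GRAIN_SUFFIXES = ("__hour", "__day", "__week", "__month", "__quarter", "__year")
# Arrow type strings look like "timestamp[us]" or "date32[day]".
TEMPORAL_PREFIXES = ("timestamp", "date")
# Assumed substrings that hint at a time column (hypothetical).
NAME_HINTS = ("date", "time", "_at")


def detect_time_series_column(columns):
    """Pick a time column from (name, arrow_type_string) pairs, or None if empty."""
    if not columns:
        return None
    # 1. A grain-suffixed column wins outright.
    for name, _ in columns:
        if name.endswith(GRAIN_SUFFIXES):
            return name
    # 2. Otherwise the first column with a temporal Arrow type.
    for name, type_str in columns:
        if type_str.startswith(TEMPORAL_PREFIXES):
            return name
    # 3. Otherwise the first column whose name looks time-like.
    for name, _ in columns:
        if any(hint in name.lower() for hint in NAME_HINTS):
            return name
    # 4. Last resort: the first column.
    return columns[0][0]
```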
💡 Codex Review
Reviewed commit: 85b72785d6
    this._metricSeriesMap[metricKey] = data.metric_series_data;
    // Use the first available metric's series as metric_series_data
    // (widget.js will parse it and render sparklines)
    const firstKey = Object.keys(this._metricSeriesMap)[0];
    if (firstKey) {
Merge metric series tables before publishing metric_series_data
fetchMetric stores each metric's Arrow payload in _metricSeriesMap but then publishes only the first map entry as metric_series_data; the widget renderer reads all sparklines from this single table by metric key, so any metric column not present is interpreted as 0 and shows incorrect charts (and the “first” metric depends on response timing). This breaks multi-metric explorer views whenever more than one metric is configured, so the per-metric payloads need to be merged into one table (or otherwise exposed per metric in a way widget.js can consume).
# Conflicts:
#	pyproject.toml
#	sidemantic/apps/__init__.py
#	tests/test_mcp_apps.py
#	uv.lock
💡 Codex Review
Reviewed commit: cb5075c041
        filter_exprs.append(f"{model_name}.{dim_key} IS NULL")
    else:
        safe = _escape_sql_literal(str(values[0]))
        filter_exprs.append(f"{model_name}.{dim_key} = '{safe}'")
Preserve dimension refs when applying filters
For explorer dimensions supplied from a related model, the leaderboard query uses the stored dc["ref"], but filter clicks only send the display key and this builder prefixes it with model_name. For example model_name="orders" with dimensions=["customers.region"] generates filters on orders.region instead of customers.region, so cross-filtering either fails compilation or filters the wrong field. Keep a key-to-ref mapping in state and build predicates from the original dimension ref.
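A minimal sketch of the key-to-ref mapping follows. It assumes each dimension config is a dict with `"key"` and `"ref"` entries (the comment mentions `dc["ref"]`; the `"key"` field is an assumption), and `build_filter_exprs` is a hypothetical name. The `_escape_sql_literal` shown here is a stand-in for the PR's helper, and only the single-value branch visible in the diff is reproduced.

```python
def _escape_sql_literal(value: str) -> str:
    """Stand-in for the PR's helper: double any single quotes."""
    return value.replace("'", "''")


def build_filter_exprs(dimensions_config, filters):
    """Build predicates from each dimension's stored ref, not model_name + key."""
    ref_by_key = {dc["key"]: dc["ref"] for dc in dimensions_config}
    exprs = []
    for dim_key, values in filters.items():
        ref = ref_by_key.get(dim_key)
        if ref is None or not values:
            continue  # unknown key or empty selection: skip it
        if values[0] is None:
            exprs.append(f"{ref} IS NULL")
        else:
            safe = _escape_sql_literal(str(values[0]))
            exprs.append(f"{ref} = '{safe}'")
    return exprs
```

For the example in the comment, a click on `region` would then filter on `customers.region` rather than `orders.region`.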
The dimension branch of widget_query reconstructed the selected metric as model_name.selected_metric, which rejected metrics whose configured ref points at a related model and left leaderboards blank. Look the key up in metrics_config and use its ref, mirroring the metric branch.
💡 Codex Review
https://github.com/sidequery/sidemantic/blob/d554c770b47ff8160e4bbc453a498285f767f194/apps/web/explorer-app.ts#L306-L309
Reset explorer state for each tool result
When the same app instance receives another explore_metrics result, this only overwrites fields present in the new payload; client-only state such as filters, brush_selection, and _metricSeriesMap can remain from the previous model. Because callRefreshPublic("all") runs immediately afterward, those stale filters are sent to widget_query and are rebuilt against the new model, which can produce invalid predicates or silently misfiltered/blank dashboards after a user starts a second exploration in the same widget.
    metric_name = metric_ref.split(".")[-1]
    metrics_config.append(
        {
            "key": metric_name,
Make explorer metric keys unique
When callers pass explicit metric refs from multiple models that share the same terminal name, e.g. orders.count and customers.count, both configs get key: "count". The UI and widget_query address metrics by this key (next(... if m["key"] == metric_key) and totals/series maps are also keyed by it), so the second metric cannot be queried or rendered independently and will be conflated with the first. Use a stable unique key derived from the full ref or de-duplicate keys.
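One possible de-duplication scheme is sketched below: keep the short name when it is unique among the requested refs, and fall back to a key derived from the full ref when it is not. The function name `build_metrics_config` and the `__` separator are assumptions for illustration; identical refs passed twice would still collide and are not handled here.

```python
from collections import Counter


def build_metrics_config(metric_refs):
    """Assign each metric ref a key that is unique within this request."""
    short_counts = Counter(ref.split(".")[-1] for ref in metric_refs)
    configs = []
    for ref in metric_refs:
        short = ref.split(".")[-1]
        # Short name if unambiguous, otherwise a key built from the full ref.
        key = short if short_counts[short] == 1 else ref.replace(".", "__")
        configs.append({"key": key, "ref": ref})
    return configs
```

With `orders.count` and `customers.count` in the same request, the keys become `orders__count` and `customers__count`, so lookups by key resolve to distinct metrics.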
Summary
@modelcontextprotocol/ext-apps SDK