RFC boundary update for PR #444
The implementation attached to this RFC expanded from a narrow CDP/browser-control proposal into a broader first-party web research capability while keeping the model-facing tool surface intentionally small.
Problem statement
TouchAI needs reliable material collection for research, industry reports, competitive analysis, daily-life lookup, technical investigation, and pages that block simple HTTP fetching. A browser-only design is too expensive and fragile for ordinary discovery, while fetch-only access fails on rendered, interactive, blocked, or login-dependent pages.
Proposed solution
Expose three compact first-party capabilities instead of many low-level tools:
web_search: search/discovery across enabled and configured providers.
web_fetch: read and extract known public URLs.
browser: rendered-page/browser control for interaction, screenshots, blocked pages, existing sessions, and verification.
Settings are split into dedicated tabs:
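Search (搜索): pluggable provider configuration, default supplier selection, quota/API-key hints, dynamic exposure based on enabled/configured state, and model-facing routing suggestions.
Browser control (浏览器控制): feature enablement, default browser resolution, custom browser executable, default homepage, browser data directory, permission mode, per-action permissions, allow/block domains, existing-session policy, default/headless mode, and advanced fingerprint simulation.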
The runtime and prompt boundary should guide the model to use search for discovery, fetch for known URLs, and browser control when rendering, interaction, verification, screenshots, login/session state, or anti-fetch behavior requires it. Major research tasks should first form a plan, prefer official and authoritative sources, use visual evidence when it directly explains the result, and provide reviewable references.
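The routing guidance above can be pictured as a small decision function. This is a minimal sketch under stated assumptions, not the implementation in PR #444: only the three tool names come from this RFC, while `ResearchStep`, its fields, and `chooseTool` are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: only the tool names are taken from the RFC.
type WebTool = "web_search" | "web_fetch" | "browser";

interface ResearchStep {
  /** A concrete public URL, if one is already known for this step. */
  knownUrl?: string;
  /** True when the step needs clicks, form input, verification, or a screenshot. */
  needsInteraction?: boolean;
  /** True when the page depends on an existing login or session. */
  needsSession?: boolean;
  /** True when an earlier fetch was blocked or returned an unrendered page. */
  fetchFailed?: boolean;
}

function chooseTool(step: ResearchStep): WebTool {
  // Rendering, interaction, session state, or anti-fetch behavior: escalate to the browser.
  if (step.needsInteraction || step.needsSession || step.fetchFailed) {
    return "browser";
  }
  // A known public URL with no special requirements is the cheap path.
  if (step.knownUrl) {
    return "web_fetch";
  }
  // Nothing to read yet, so the step is discovery.
  return "web_search";
}
```

In practice this guidance is expressed through prompts rather than hard-coded, so the function is best read as a summary of the intended order of preference: search for discovery, fetch for known URLs, browser only when the cheaper paths cannot work.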
Affected boundaries
AgentService or conversation runtime
tool execution or instruction loading
session persistence or context construction
database schema or migrations
settings UI
native browser runtime
MCP integration
Design details
Search providers are intentionally pluggable so providers such as AnySearch, SearXNG, Semantic Scholar, Brave, Tavily, Exa, Firecrawl, Wikipedia, GitHub, and OpenAlex can be added or removed without reshaping the Settings UI or the model-facing tool contract.
API-key providers must not be exposed as usable until configured; no-key/default providers preserve useful behavior without extra setup.
Browser control uses installed-browser discovery and default-browser resolution so the UI can show the browser that will actually be used while still allowing a custom executable path.
Browser data defaults live under the application data area so profiles are reusable and predictable instead of being hidden in temporary runtime folders.
Existing-session connection is policy-controlled: allow, reject, or ask/select when multiple sessions are available.
Browser permissions support an overall mode plus granular fields, so users can choose always allow, automatic per-action rules, or reject.
Tool calls carry semantic descriptions for review/history display; routine status/current-tab style operations can use fixed concise text, while meaningful browser actions must explain the intent.
Fingerprint simulation/headless settings are best-effort compatibility controls, not a promise to bypass anti-bot systems or site policy.
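The permission model described above (an overall mode plus granular per-action fields) could be resolved along the following lines. This is a sketch only: the type names, the action names, the `"ask"` fallback for actions without a rule, and the choice to let a blocked domain override every mode are all assumptions for illustration, not behavior confirmed by this RFC.

```typescript
// Hypothetical names and values; the RFC only states that an overall mode
// and granular per-action fields exist.
type Decision = "allow" | "ask" | "reject";
type PermissionMode = "always_allow" | "per_action" | "reject";
type BrowserAction = "navigate" | "click" | "type" | "screenshot";

interface BrowserPermissions {
  mode: PermissionMode;
  /** Granular rules, consulted only in per_action mode. */
  perAction: Partial<Record<BrowserAction, Decision>>;
  /** Hostnames that are never automated (assumed to win over the mode). */
  blockedDomains: string[];
}

function resolvePermission(
  perms: BrowserPermissions,
  action: BrowserAction,
  host: string,
): Decision {
  // Assumption: a blocked domain (or any of its subdomains) is rejected first.
  const blocked = perms.blockedDomains.some(
    (d) => host === d || host.endsWith("." + d),
  );
  if (blocked) return "reject";

  if (perms.mode === "reject") return "reject";
  if (perms.mode === "always_allow") return "allow";

  // per_action mode: use the granular rule, and ask the user when none is set.
  return perms.perAction[action] ?? "ask";
}

const example: BrowserPermissions = {
  mode: "per_action",
  perAction: { screenshot: "allow" },
  blockedDomains: ["blocked.example"],
};

console.log(resolvePermission(example, "screenshot", "docs.example")); // "allow"
console.log(resolvePermission(example, "click", "docs.example")); // "ask"
console.log(resolvePermission(example, "screenshot", "www.blocked.example")); // "reject"
```

Checking the block list before the overall mode keeps "always allow" from silently overriding an explicit user exclusion, which matches the RFC's emphasis on transparent, user-controlled settings.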
Alternatives and trade-offs
Many raw provider/browser tools: more flexible, but noisy for the model and harder for users to audit.
Browser-only automation: useful for interactive pages, but too expensive and fragile as the default path for ordinary research discovery.
Paid third-party search only: strong quality for configured users, but conflicts with the goal of useful default behavior without extra setup.
Fetching search-result pages directly: avoids provider integration, but is brittle, anti-bot-prone, and often produces weaker source attribution.
Full anti-detect browser stack: stronger stealth potential, but high maintenance and compliance cost; this RFC keeps to transparent, user-controlled compatibility settings.
Upstream references
The implementation direction was informed by mainstream agent/search/browser patterns including OpenCode/OpenClaw-style search/fetch separation, Browser-use/Patchright/Camoufox-style browser control considerations, and provider-style search services such as Brave Search, Tavily, Exa, Firecrawl, SearXNG, OpenAlex, Semantic Scholar, GitHub, Wikipedia, and AnySearch.
Acceptance criteria
The model sees a small, understandable set of web research tools rather than many low-level provider/browser operations.
Search settings are managed in a dedicated Settings tab with provider enablement, default supplier, quota labels, API-key handling, and dynamic model exposure.
Browser settings are managed in a dedicated Settings tab with browser discovery/defaults, custom executable path, profile data path, default homepage, permission controls, existing-session policy, launch mode, and fingerprint simulation.
Search/fetch/browser prompt guidance covers authoritative sourcing, deep research planning, visual evidence, access restrictions, and escalation to browser control when fetch is insufficient.
Browser tool calls include concise semantic descriptions suitable for approval UI and history display.
Native browser commands and browser tool behavior have automated coverage.
Testing and rollout
Implementation PR: #444.
Current verification for PR #444:
GitHub Checks are green: CI Required, Conventional Commits, Frontend Quality, Frontend Tests, Rust Checks, Desktop E2E Smoke (Windows), E2E Required, CodeQL (javascript-typescript), CodeQL (rust), Site Build, and Validate PR template passed.
Local desktop checks passed for lint, format, typecheck, frontend coverage, Rust formatting, and Rust browser_commands integration tests.
A local check:rust wrapper previously hit a GitHub RTK download timeout, but the equivalent Rust validation completed in CI.
Rollout should keep the capability behind explicit settings/permissions, preserve no-extra-setup search defaults, and treat browser anti-bot handling as best-effort rather than guaranteed bypass.
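To illustrate how no-extra-setup defaults and API-key gating can coexist, here is a minimal sketch of the dynamic exposure rule: a provider is offered to the model only when it is enabled and, if it needs a key, configured. The `SearchProvider` shape, the function names, and the provider ids are hypothetical placeholders, not the actual settings schema.

```typescript
// Hypothetical settings shape; ids below are placeholders, not real providers.
interface SearchProvider {
  id: string;
  enabled: boolean;
  requiresApiKey: boolean;
  apiKey?: string;
}

/** A provider is usable when enabled and, if it needs a key, configured. */
function isUsable(p: SearchProvider): boolean {
  if (!p.enabled) return false;
  return !p.requiresApiKey || (p.apiKey ?? "").trim().length > 0;
}

/** Ids of the providers that may be exposed to the model. */
function exposedProviders(all: SearchProvider[]): string[] {
  return all.filter(isUsable).map((p) => p.id);
}

/** web_search itself is only worth exposing when at least one provider is usable. */
function shouldExposeWebSearch(all: SearchProvider[]): boolean {
  return exposedProviders(all).length > 0;
}

const providers: SearchProvider[] = [
  { id: "keyless-default", enabled: true, requiresApiKey: false },
  { id: "keyed-unconfigured", enabled: true, requiresApiKey: true },
  { id: "keyed-configured", enabled: true, requiresApiKey: true, apiKey: "k" },
  { id: "keyless-disabled", enabled: false, requiresApiKey: false },
];

console.log(exposedProviders(providers)); // ["keyless-default", "keyed-configured"]
```

With this rule a fresh install still has a working search path through the keyless default, while keyed providers appear only after the user has configured them.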