fix(lmtool): reframe timeout messages as transient + retry-friendly to nudge model retries #1652
Open
wenytang-ms wants to merge 2 commits into
The two timeout-path response messages previously led with '❌ Debug session failed to start' and listed root-cause hypotheses (compilation errors, ClassNotFoundException, ...) as 'This usually indicates a problem'. Telemetry shows that interpretation is wrong:

- 30d retry analysis: success rate climbs from 17.7% (1 invoke) to 64.9% (6-10 invokes), a 3.7x lift
- Started latency P95 = 100s vs the 45s / 15s thresholds; many timeouts are just slow-starting JVMs that DO eventually attach
- 66% of users invoke only once, suggesting the model treats the ❌ wording as a permanent failure and gives up

Rewrite both timeout messages to:

1. Lead with ⏳ (transient) instead of ❌ / ⚠️ (terminal)
2. State explicitly 'this is often transient'
3. Cite the retry-success pattern
4. Put 'call debug_java_application again' as the first recommended action
5. Keep the diagnostic checklist as a fallback (steps 3-4) for genuinely broken cases

No behavior change in the extension; only the natural-language string returned to the language model is changed. This is a targeted nudge for the model's retry policy.

Companion PR #1650 will give us per-invoke retry telemetry (retryCount / previousOutcome) to verify the lift after this lands.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR adjusts the language-model-facing timeout messages in debug_java_application to frame timeouts as often transient and to encourage retry/polling behavior, aligning the tool’s guidance with observed slow-start telemetry patterns.
Changes:
- Replaces terminal-failure timeout wording with transient/retry-friendly messaging for the event-based wait path.
- Rewords the smart-polling timeout message to emphasize polling/retry as the primary next steps.
- Keeps existing timeout thresholds/behavior intact, changing only the natural-language response strings.
+ `Recommended next actions (in order):\n`
+ `1. Call get_debug_session_info() to check whether the session has since become active.\n`
+ `2. Call debug_java_application again — most timeout cases recover on retry. `
+ `Pass waitForSession=true to extend the wait window for slow-starting apps.\n`
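The ordered actions in the message above amount to a small decision policy for whoever consumes the tool result. The following is a minimal sketch of that policy, not code from this PR: the `RecoveryState` shape, the `nextAction` function, the `run_diagnostics` label, and the `maxRetries` cap are all hypothetical names introduced for illustration. Only the two tool names come from the message text.

```typescript
// Hypothetical sketch: models the "recommended next actions (in order)" as a pure function.
// Only get_debug_session_info and debug_java_application are real tool names from the PR;
// everything else here is an illustrative assumption.
type Action =
    | "get_debug_session_info"
    | "debug_java_application"
    | "run_diagnostics"
    | "done";

interface RecoveryState {
    polled: boolean;        // has the session been polled since the timeout?
    sessionActive: boolean; // did the last poll or retry report an active session?
    retries: number;        // retries issued so far
    maxRetries: number;     // assumed cap before falling back to diagnostics
}

function nextAction(state: RecoveryState): Action {
    if (state.sessionActive) {
        return "done"; // the slow-starting JVM attached after all
    }
    if (!state.polled) {
        return "get_debug_session_info"; // step 1: check before relaunching
    }
    if (state.retries < state.maxRetries) {
        return "debug_java_application"; // step 2: retry (optionally with waitForSession=true)
    }
    return "run_diagnostics"; // fallback: the diagnostic checklist
}

// Usage: walk through a timeout that never recovers.
console.log(nextAction({ polled: false, sessionActive: false, retries: 0, maxRetries: 2 }));
console.log(nextAction({ polled: true, sessionActive: false, retries: 0, maxRetries: 2 }));
console.log(nextAction({ polled: true, sessionActive: false, retries: 2, maxRetries: 2 }));
```

Keeping the policy as a pure function makes the ordering easy to check in isolation; the actual tool calls would sit in whatever loop drives it.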
…g timeout message
Problem
The two timeout-path response strings returned to the language model previously led with terminal-failure framing:
- ❌ Debug session failed to start within 45 seconds for ...
- ⚠️ Debug command sent ... but session not detected within 15 seconds ...
and immediately followed with "This usually indicates a problem:", listing compilation errors, ClassNotFoundException, application crashes, etc. as the primary causes.
30-day telemetry says that framing is wrong.
[Telemetry table; surviving row label: debug_java_application invokes with no outcome event within 60s]
In other words: a large fraction of "timeouts" are slow-starting JVMs that DO eventually attach, not permanent failures. The model is treating the ❌ wording as a stop signal and abandoning sessions that would have succeeded on a retry or a get_debug_session_info() poll.
Fix
Rewrite both timeout messages to:
- Lead with ⏳ (transient) instead of ❌ / ⚠️ (terminal).
- State explicitly "This is often transient".
- Move "call debug_java_application again" / "call get_debug_session_info()" to the top of the recommended-actions list, with diagnostic steps demoted to fallbacks.
Before (eventBased path)
After (eventBased path)
The smartPolling path is reworded analogously.
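To make the shape of the rewrite concrete, here is a minimal sketch of a message builder that follows the rules listed under Fix. It is not the PR's actual code: `buildTimeoutMessage`, the `TimeoutPath` type, the headline wording, and the wording of fallback steps 3 and 4 are assumptions made for illustration. Steps 1 and 2 paraphrase the strings quoted in the review snippet.

```typescript
// Hypothetical sketch of a retry-friendly timeout message.
// Function name, parameters, headline text, and steps 3-4 are illustrative assumptions.
type TimeoutPath = "eventBased" | "smartPolling";

function buildTimeoutMessage(path: TimeoutPath, target: string, waitedSeconds: number): string {
    // Lead with the transient marker instead of a terminal-failure marker.
    const headline = path === "eventBased"
        ? `⏳ Debug session for ${target} has not started within ${waitedSeconds} seconds.`
        : `⏳ Debug command sent for ${target}, but no session was detected within ${waitedSeconds} seconds.`;

    return [
        headline,
        "This is often transient: slow-starting JVMs frequently attach after the wait window.",
        "Recommended next actions (in order):",
        "1. Call get_debug_session_info() to check whether the session has since become active.",
        "2. Call debug_java_application again; most timeout cases recover on retry.",
        // Diagnostic checklist kept only as a fallback.
        "3. If retries keep timing out, check the project for compilation errors.",
        "4. Check the debug console for ClassNotFoundException or an application crash.",
    ].join("\n");
}

// Usage: render the message for each path.
console.log(buildTimeoutMessage("eventBased", "com.example.Main", 45));
console.log(buildTimeoutMessage("smartPolling", "com.example.Main", 15));
```

Building both paths from one function keeps the action ordering identical across them, which is the property the rewrite cares about.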
Scope
Test
- npx tsc --noEmit: clean
- npm run tslint: clean
Validation plan
After this lands, the per-invoke retryCount / previousOutcome fields from #1650 will let us measure the lift directly.
Target after this PR ships:
- retry_after_timeout_pct for the new extversion should increase materially (the current baseline implies ~34% of attempts come after some kind of timeout).
Related
- debugagent/06-深度诊断-PR定向-2026-06-09.md: P1-2 "66% one-and-done".
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
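The retry_after_timeout_pct metric named in the validation plan could be computed along the following lines. This is a sketch under stated assumptions, not the team's actual query: the `InvokeRecord` shape, the outcome values, and the definition used here (share of all invokes that are retries whose previous outcome was a timeout) are hypothetical. Only the field names retryCount and previousOutcome come from the PR text.

```typescript
// Hypothetical sketch: one possible definition of retry_after_timeout_pct.
// Only retryCount and previousOutcome are named in the PR; the outcome values are assumed.
interface InvokeRecord {
    retryCount: number; // 0 for a first attempt
    previousOutcome: "timeout" | "started" | "failed" | null; // null when there was no prior attempt
}

function retryAfterTimeoutPct(records: InvokeRecord[]): number {
    if (records.length === 0) {
        return 0; // avoid dividing by zero on an empty window
    }
    const afterTimeout = records.filter(
        (r) => r.retryCount > 0 && r.previousOutcome === "timeout"
    ).length;
    return (afterTimeout / records.length) * 100;
}

// Usage: four invokes, one of which is a retry after a timeout.
const sampleRecords: InvokeRecord[] = [
    { retryCount: 0, previousOutcome: null },
    { retryCount: 1, previousOutcome: "timeout" },
    { retryCount: 0, previousOutcome: null },
    { retryCount: 1, previousOutcome: "failed" },
];
console.log(retryAfterTimeoutPct(sampleRecords)); // 25
```

Comparing this value between the old and new extension versions would show whether the reworded messages change retry behavior.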