fix(lmtool): reframe timeout messages as transient + retry-friendly to nudge model retries by wenytang-ms · Pull Request #1652 · microsoft/vscode-java-debug

wenytang-ms · 2026-06-09T07:01:06Z

Problem

The two timeout-path response strings returned to the language model previously led with terminal-failure framing:

❌ Debug session failed to start within 45 seconds for ...
⚠️ Debug command sent ... but session not detected within 15 seconds ...

and immediately followed with "This usually indicates a problem:", listing compilation errors, ClassNotFoundException, application crashes, and similar as the primary causes.

30-day telemetry says that framing is wrong.

Signal	Data
Started latency P95	100 s — well above the 45 s / 15 s thresholds
Retry → success rate (1 invoke → 6–10 invokes)	17.7% → 64.9% (3.7x lift)
One-and-done users	66% — model rarely retries after the first ❌
`debug_java_application` invokes with no outcome event within 60s	32.8% — many are still launching

In other words: a large fraction of "timeouts" are slow-starting JVMs that DO eventually attach, not permanent failures. The model is treating the ❌ wording as a stop signal and abandoning sessions that would have succeeded on a retry or a get_debug_session_info() poll.

Fix

Rewrite both timeout messages to:

Lead with ⏳ (transient) instead of ❌ / ⚠️ (terminal).
State explicitly "This is often transient".
Cite the retry pattern so the model treats retry as the expected next step.
Move "call debug_java_application again" and "call get_debug_session_info()" to the top of the recommended-actions list, with diagnostic steps demoted to fallbacks.
Preserve the diagnostic checklist (compilation errors, ClassNotFoundException, classpath) as steps 3–4 — they still matter for genuinely broken cases.
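
A minimal sketch of the caller-side behavior the reworded messages are meant to encourage, under stated assumptions: `DebugTools`, `launchUntilAttached`, and the default of three launches are hypothetical and not part of this PR. The two methods are synchronous stand-ins for `debug_java_application` and `get_debug_session_info()`; real tool calls are asynchronous. Polling before relaunching is a choice made for this sketch.

```typescript
// Hypothetical stand-ins for the two tool calls (real calls are async).
type LaunchOutcome = "started" | "timeout";

interface DebugTools {
  launch(): LaunchOutcome;      // stands in for debug_java_application
  isSessionActive(): boolean;   // stands in for get_debug_session_info()
}

interface FinalState {
  status: "started" | "needs_diagnostics";
  launches: number;
}

// Treat a timeout as transient: poll for a late-attaching session, then
// relaunch, and only fall back to diagnostics once retries are exhausted.
function launchUntilAttached(tools: DebugTools, maxLaunches = 3): FinalState {
  let launches = 0;
  while (launches < maxLaunches) {
    launches++;
    if (tools.launch() === "started") {
      return { status: "started", launches };
    }
    // Timed out: a slow JVM may still attach, so check before relaunching.
    if (tools.isSessionActive()) {
      return { status: "started", launches };
    }
  }
  // Still timing out: move on to the diagnostic checklist (steps 3-4).
  return { status: "needs_diagnostics", launches };
}
```

With a fake `DebugTools` whose `launch()` times out twice and then starts, this returns `{ status: "started", launches: 3 }`, whereas a stop-on-first-timeout policy would have abandoned the session after one call.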

Before (eventBased path)

❌ Debug session failed to start within 45 seconds for App.

This usually indicates a problem:
• Compilation errors preventing startup
• ClassNotFoundException or NoClassDefFoundError
• Application crashed during initialization
• Incorrect main class or classpath configuration

Action required:
1. Check terminal 'Java Debug' for error messages
2. Verify the target class name is correct
3. Ensure the project is compiled successfully
4. Use get_debug_session_info() to confirm session status

After (eventBased path)

⏳ Debug session not yet detected for App after 45 seconds.

This is often transient — the JVM may still be starting up (large projects, cold
class-loading, or remote workspaces can need additional time). Telemetry shows
that retrying a timed-out launch succeeds for the majority of cases.

Recommended next actions (in order):
1. Call debug_java_application again — most timeout cases recover on retry.
2. Call get_debug_session_info() to check whether the session has since become
   active.
3. If retrying still times out, inspect terminal 'Java Debug' for compilation
   errors, ClassNotFoundException, NoClassDefFoundError, or other startup
   failures.
4. Verify the target class name and classpath are correct, then retry.

The smartPolling path is reworded analogously.
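
As an illustration of the ordering change only, the "After" message could be assembled roughly as follows. This is a sketch, not the code in the diff: `buildTimeoutMessage` and its parameters are hypothetical names, and the step text is abbreviated from the message shown above.

```typescript
// Illustrative only: the function name and parameters are hypothetical,
// and the wording is shortened from the "After" message in this PR.
function buildTimeoutMessage(target: string, timeoutSeconds: number): string {
  // Retry and polling come first; diagnostics are demoted to fallbacks.
  const actions = [
    "Call debug_java_application again.",
    "Call get_debug_session_info() to check whether the session has since become active.",
    "If retrying still times out, inspect terminal 'Java Debug' for startup failures.",
    "Verify the target class name and classpath are correct, then retry.",
  ];
  const numbered = actions.map((text, i) => `${i + 1}. ${text}`).join("\n");
  return `⏳ Debug session not yet detected for ${target} after ${timeoutSeconds} seconds.\n\n`
    + `This is often transient.\n\n`
    + `Recommended next actions (in order):\n`
    + numbered;
}
```

Calling `buildTimeoutMessage("App", 45)` yields a string that opens with the ⏳ headline and lists the retry step before any diagnostic step.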

Scope

No behavior change in the extension. Status codes, return types, telemetry events, and timeout thresholds are unchanged.
Only the natural-language string returned to the model is modified.
This is a deliberate, targeted nudge for the model's retry policy.

Test

npx tsc --noEmit — clean
npm run tslint — clean
Diff: +26 / -20 in 1 file

Validation plan

After this lands, the per-invoke retryCount / previousOutcome fields from #1650 will let us measure the lift directly:

RawEventsVSCodeExt
| where ServerTimestamp > ago(14d)
| where ExtensionName == "vscjava.vscode-java-debug"
| extend op = tostring(Properties["operationname"])
| where op endswith "debug_java_application.invoke"
| extend retryCount = toint(Properties["retrycount"]),
         previousOutcome = tostring(Properties["previousoutcome"])
| summarize invokes = count(),
            after_timeout = countif(previousOutcome == "timeout")
            by extversion = tostring(Properties["common.extversion"])
| extend retry_after_timeout_pct = round(100.0 * after_timeout / invokes, 1)

Target after this PR ships: retry_after_timeout_pct for the new extversion should increase materially (the current baseline implies that ~34% of attempts come after some kind of timeout).
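
For readers less familiar with the query language, the same aggregation can be sketched in TypeScript over in-memory events. The `InvokeEvent` shape and the function name are assumptions for illustration; only the field names `previousOutcome` and `extversion` and the value `"timeout"` come from the query above.

```typescript
// Hypothetical in-memory shape mirroring the fields the query reads.
interface InvokeEvent {
  extVersion: string;
  previousOutcome: string; // "timeout" is the only value the query tests for
}

// Per extension version: the share of invokes that followed a timeout,
// rounded to one decimal place like round(100.0 * after_timeout / invokes, 1).
function retryAfterTimeoutPct(events: InvokeEvent[]): Record<string, number> {
  const totals: Record<string, { invokes: number; afterTimeout: number }> = {};
  for (const e of events) {
    const t = totals[e.extVersion] || { invokes: 0, afterTimeout: 0 };
    t.invokes += 1;
    if (e.previousOutcome === "timeout") {
      t.afterTimeout += 1;
    }
    totals[e.extVersion] = t;
  }
  const result: Record<string, number> = {};
  for (const version of Object.keys(totals)) {
    const t = totals[version];
    result[version] = Math.round((1000 * t.afterTimeout) / t.invokes) / 10;
  }
  return result;
}
```

Comparing the value for the version that ships this change against the previous version gives the before/after signal described above.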

Related

Independent of #1650 (perf(telemetry): unblock per-tool quality analysis (GDPR annotations + diagnostic fields + InvocationGuard); the telemetry contract) and #1651 (fix(lmtool): stop double-counting classNameDetection on every invoke; the classNameDetection dedupe). All three can land in any order.
Same diagnostic context: debugagent/06-深度诊断-PR定向-2026-06-09.md, P1-2 "66% one-and-done".

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The two timeout-path response messages previously led with '❌ Debug session failed to start' and listed root-cause hypotheses (compilation errors, ClassNotFoundException, ...) as 'This usually indicates a problem'.

Telemetry shows that interpretation is wrong:
- 30d retry analysis: success rate climbs from 17.7% (1 invoke) to 64.9% (6-10 invokes), a 3.7x lift
- Started latency P95 = 100s vs the 45s / 15s thresholds: many timeouts are just slow-starting JVMs that DO eventually attach
- 66% of users invoke only once, suggesting the model treats the ❌ wording as a permanent failure and gives up

Rewrite both timeout messages to:
1. Lead with ⏳ (transient) instead of ❌ / ⚠️ (terminal)
2. State explicitly 'this is often transient'
3. Cite the retry-success pattern
4. Put 'call debug_java_application again' as the first recommended action
5. Keep the diagnostic checklist as a fallback (steps 3-4) for genuinely broken cases

No behavior change in the extension: only the natural-language string returned to the language model is changed. This is a targeted nudge for the model's retry policy.

Companion PR #1650 will give us per-invoke retry telemetry (retryCount / previousOutcome) to verify the lift after this lands.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR adjusts the language-model-facing timeout messages in debug_java_application to frame timeouts as often transient and to encourage retry/polling behavior, aligning the tool’s guidance with observed slow-start telemetry patterns.

Changes:

Replaces terminal-failure timeout wording with transient/retry-friendly messaging for the event-based wait path.
Rewords the smart-polling timeout message to emphasize polling/retry as the primary next steps.
Keeps existing timeout thresholds/behavior intact, changing only the natural-language response strings.


+                   + `Recommended next actions (in order):\n`
+                   + `1. Call get_debug_session_info() to check whether the session has since become active.\n`
+                   + `2. Call debug_java_application again — most timeout cases recover on retry. `
+                   + `Pass waitForSession=true to extend the wait window for slow-starting apps.\n`



wenytang-ms requested review from chagong, jdneo and testforstephen as code owners June 9, 2026 07:01

wenytang-ms requested a review from Copilot June 9, 2026 07:57

Copilot started reviewing on behalf of wenytang-ms June 9, 2026 07:57

Copilot AI reviewed Jun 9, 2026


review: clarify JSON object syntax for waitForSession in smart-polling timeout message

71ce127

wenytang-ms mentioned this pull request Jun 9, 2026

fix(lmtool): stop double-counting classNameDetection on every invoke #1651

Open

wenytang-ms wants to merge 2 commits into main from fix/lmtool-timeout-message-retry-friendly
