WebGPU validation error (op_handler_failed) when running qwen3.5-4b-generic-gpu:2 on Intel Arc Graphics #799

### Description
Running the model `qwen3.5-4b-generic-gpu:2` with `foundry run` fails with an `op_handler_failed` IPC error caused by a WebGPU validation failure. The model does not generate any response.

### Environment
- **OS:** Windows (PowerShell output shows `PS C:\Users\Fatih`)
- **GPU:** Intel Arc Graphics (device ID 7D55, vendor 8086)
- **Driver version:** 32.0.101.8332 (latest available at the time of filing)
- **DirectX:** 12
- **Vulkan:** 1.4.328
- **OpenCL:** 3.0
- **Shader Model:** 6.7
- **Dedicated GPU memory:** 128 MB
- **Shared system memory:** 18 GB
- **Foundry version:** 0.10.0+174be11ea7aeacd8d0d67b0ba1daebec615284b1
- **ONNX Runtime GenAI version:** included with Foundry

### Steps to Reproduce
1. Install the Microsoft Azure AI Foundry CLI.
2. Run the following command in PowerShell:
   ```powershell
   foundry run qwen3.5-4b-generic-gpu:2
   ```
3. Type any prompt (e.g., `selamun aleyküm`).
4. Observe the error.

### Actual Error Log
```
● error: IPC error 'op_handler_failed': Error from chat_completions command: Error:
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: WebGPU validation failed. [Buffer (unlabeled)] usage
(Storage(read-write)|Storage(read-only)) includes writable usage and another usage in the same synchronization scope.
 - While validating compute pass usage.
 - While finishing [CommandEncoder (unlabeled)].

   at Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess(IntPtr) + 0x47
   at Microsoft.ML.OnnxRuntimeGenAI.Generator.AppendTokenSequences(Sequences) + 0x1f
   at Microsoft.Neutron.OpenAI.Provider.OnnxChatGenerator..ctor(OnnxLoadedModel, GeneratorParams, ILogger, Sequences,
NamedTensors) + 0x94
   at Microsoft.Neutron.OpenAI.Provider.OnnxChatGenerator.CreateOnnxChatGenerator(ChatCompletionCreateRequestExtended,
OnnxLoadedModel, AzureFoundryLocalModel, ITelemetry, ILogger) + 0xa94
   at Microsoft.AI.Foundry.Local.ChatClient.<>c__DisplayClass8_0.<HandleStreamRequestAsync>b__0(CancellationToken) +
0x2a
   at Microsoft.Neutron.OpenAI.Provider.ChatCompletions.<HandleStreamRequestAsync>d__3.MoveNext() + 0x234
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.ChatClient.<HandleStreamRequestAsync>d__8.MoveNext() + 0x2cb
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.ChatClient.<HandleStreamRequestAsync>d__8.MoveNext() + 0x446
--- End of stack trace from previous location ---
   at
Microsoft.AI.Foundry.Local.NativeInterop.<>c__DisplayClass13_0.<<ExecuteCommandWithCallbackManaged>b__2>d.MoveNext() +
0x467
--- End of stack trace from previous location ---
   at
Microsoft.AI.Foundry.Local.NativeInterop.<>c__DisplayClass13_0.<<ExecuteCommandWithCallbackManaged>b__2>d.MoveNext() +
0x7d9
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.NativeInterop.<ExecuteWithTracker>d__9.MoveNext() + 0xb8
``` 

### Expected Behavior

The model should run without WebGPU validation errors and generate responses.

### Workarounds Found (for maintainers)

- Using the CPU-only variant works (though it is slower):

  ```powershell
  foundry run qwen3.5-4b-cpu:2
  ```

### Additional Context

- The error indicates that the same buffer is used as both Storage(read-write) and Storage(read-only) within a single synchronization scope (here, a compute pass). The WebGPU specification does not allow a writable usage to be combined with any other usage of the same buffer in one scope.

- The issue might be specific to Intel Arc drivers or their WebGPU implementation when used with ONNX Runtime GenAI.
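To make the rule behind the error message concrete, here is a minimal Python sketch of the check, assuming a simplified model in which each binding in one scope is just a buffer label plus a usage string. It is not the validation code of ONNX Runtime GenAI or of any WebGPU implementation, and the names `Binding`, `validate_usage_scope`, `buf_a`, `buf_a_out`, and `buf_b` are hypothetical.

```python
from dataclasses import dataclass

# Simplified model of WebGPU usage-scope validation (illustration only).
# Assumption: only the two storage usages named in the error are modeled.
WRITABLE = {"storage-read-write"}


@dataclass(frozen=True)
class Binding:
    buffer: str  # hypothetical buffer label
    usage: str   # "storage-read-only" or "storage-read-write"


def validate_usage_scope(bindings):
    """Return labels of buffers that mix a writable usage with another usage."""
    usages = {}
    for b in bindings:
        usages.setdefault(b.buffer, set()).add(b.usage)
    return sorted(
        name for name, used in usages.items()
        if used & WRITABLE and len(used) > 1
    )


# One scope that binds the same buffer as both read-only and read-write.
conflicting = [
    Binding("buf_a", "storage-read-only"),
    Binding("buf_a", "storage-read-write"),
    Binding("buf_b", "storage-read-only"),
]

# The same work with the written data going to a separate buffer.
separated = [
    Binding("buf_a", "storage-read-only"),
    Binding("buf_a_out", "storage-read-write"),
    Binding("buf_b", "storage-read-only"),
]

print(validate_usage_scope(conflicting))  # ['buf_a']
print(validate_usage_scope(separated))    # []
```

In this model the first scope is rejected because `buf_a` carries both usages, which matches the shape of the reported message. Whether the real failure comes from an aliased input and output tensor or from something else cannot be determined from the log alone.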

#### Possible Root Cause
ONNX Runtime GenAI's WebGPU backend may be generating buffers with incompatible usage flags, or the Intel Arc WebGPU driver may be stricter about validation than other implementations.
