Fix WebGPU device destroyed on session release, breaking session recreation #27634

Open
nico-martin wants to merge 4 commits into microsoft:main from nico-martin:fix/create-webgpu-device-after-release

Description

We ran into unexpected behavior in Transformers.js V4: after calling InferenceSession.release() on a WebGPU session, attempting to create a new WebGPU session fails with:

WebGPU device lost (2): Device was destroyed.

In Transformers.js we encourage the create -> release -> create pattern, because we expect applications to run for some time and to use multiple models, so it makes sense to unload a model once its job is done.

This appears to have been introduced in e03631ee528, which added the preserveDevice option with a default value of false. When the last session is released and preserveDevice=false, the C++ side destroys the WebGPU device, but the JavaScript reference in env.webgpu.device is never cleared, leaving a stale reference to a destroyed device.
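The failure mode can be illustrated with a small, self-contained simulation. This is only a sketch of the described behavior, not the actual onnxruntime-web code: FakeDevice, fakeEnv, createSession, releaseSession and reproduce are hypothetical names, and the real device handling lives partly on the C++ side.

```typescript
// Simplified simulation of the stale-reference bug described above.
// All names here are hypothetical stand-ins, not onnxruntime-web APIs.

class FakeDevice {
  destroyed = false;
  destroy(): void {
    this.destroyed = true;
  }
}

// Stand-in for env.webgpu: caches the device between sessions.
const fakeEnv: { device?: FakeDevice } = {};
let activeSessions = 0;

function createSession(): void {
  // Reuse the cached device if a reference exists, otherwise acquire a new one.
  const device = fakeEnv.device ?? (fakeEnv.device = new FakeDevice());
  if (device.destroyed) {
    throw new Error('WebGPU device lost (2): Device was destroyed.');
  }
  activeSessions++;
}

function releaseSession(preserveDevice = false): void {
  activeSessions--;
  if (activeSessions === 0 && !preserveDevice) {
    // The device is destroyed, but the cached reference is left in place.
    fakeEnv.device?.destroy();
  }
}

// create -> release -> create: returns 'ok' or the error message.
function reproduce(): string {
  createSession();
  releaseSession();
  try {
    createSession();
    return 'ok';
  } catch (e) {
    return (e as Error).message;
  }
}
```

In this simulation the second createSession() picks up the cached, already destroyed device instead of acquiring a fresh one, which mirrors the stale env.webgpu.device reference described above.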

Changes

Clear stale device reference when lost (backend-webgpu.ts)

  1. Made the device property configurable (configurable: true) so that it can be deleted
  2. Added cleanup logic in dispose() that detects device loss via the device.lost promise
  3. When the device is lost (destroyed, driver crash, etc.), the stale env.webgpu.device reference is deleted

This allows subsequent session creation to acquire a fresh device instead of attempting to reuse a lost one.
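The idea behind these changes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in backend-webgpu.ts: webgpuEnv, exposeDevice and clearDeviceWhenLost are hypothetical names, and LostableDevice models only the lost promise that a WebGPU GPUDevice exposes.

```typescript
// Minimal stand-ins for the WebGPU types involved; only `lost` is used here.
interface DeviceLostInfo {
  reason: string;
  message: string;
}
interface LostableDevice {
  lost: Promise<DeviceLostInfo>;
}

// Hypothetical object standing in for env.webgpu.
const webgpuEnv: { device?: LostableDevice } = {};

// Expose the device as a read-only property that can still be deleted.
function exposeDevice(device: LostableDevice): void {
  Object.defineProperty(webgpuEnv, 'device', {
    value: device,
    writable: false,
    enumerable: true,
    configurable: true, // required so the property can be deleted later
  });
}

// Remove the cached reference once the device reports that it is lost.
function clearDeviceWhenLost(device: LostableDevice): Promise<void> {
  return device.lost.then(() => {
    // Only clear the reference if it still points at the lost device.
    if (webgpuEnv.device === device) {
      delete webgpuEnv.device;
    }
  });
}
```

Without configurable: true, the delete would fail on a property created through Object.defineProperty, which is why the first change is a prerequisite for the third.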

@guschmue added the ep:WebGPU (ort-web webgpu provider) label on Mar 12, 2026