Add runtime provider abstraction for optimized execution #27

@Tinnci

Description

Add a runtime-provider layer that sits below model backends and above device detection.

Candidate providers:

  • PyTorch eager mode, with torch.compile where it is useful.
  • ONNX Runtime execution providers: CPU, CUDA, ROCm, DirectML, and OpenVINO.
  • TensorRT or OpenVINO SDK adapters, where model export and licensing allow them.
  • An Intel XPU / IPEX route, if it can be tested without contaminating default installs.
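
To make the candidate list concrete, here is a minimal sketch of how it could be written down as data. The `ProviderCandidate` class, the short keys such as `ort-cuda`, and the `usable_ort_candidates` helper are hypothetical names for illustration, not an agreed API. The ONNX Runtime identifiers are the execution-provider strings that `onnxruntime.get_available_providers()` is expected to report; they should be checked against the installed ONNX Runtime build. The helper takes the available names as an argument so that it does not import `onnxruntime` itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ProviderCandidate:
    """Hypothetical description of one candidate runtime provider."""
    key: str                        # short name used in this sketch only
    runtime: str                    # "torch" or "onnxruntime"
    ort_name: Optional[str] = None  # ONNX Runtime execution-provider name, if any


# Assumed catalogue; TensorRT, OpenVINO SDK and IPEX adapters are left out here.
CANDIDATES = (
    ProviderCandidate("torch-eager", "torch"),
    ProviderCandidate("torch-compile", "torch"),
    ProviderCandidate("ort-cpu", "onnxruntime", "CPUExecutionProvider"),
    ProviderCandidate("ort-cuda", "onnxruntime", "CUDAExecutionProvider"),
    ProviderCandidate("ort-rocm", "onnxruntime", "ROCMExecutionProvider"),
    ProviderCandidate("ort-directml", "onnxruntime", "DmlExecutionProvider"),
    ProviderCandidate("ort-openvino", "onnxruntime", "OpenVINOExecutionProvider"),
)


def usable_ort_candidates(available: Sequence[str]) -> List[ProviderCandidate]:
    """Keep the ONNX Runtime candidates whose provider name is in `available`.

    `available` would normally come from onnxruntime.get_available_providers();
    passing it in keeps this function free of heavy imports.
    """
    present = set(available)
    return [c for c in CANDIDATES if c.ort_name in present]
```

The result follows the order of `CANDIDATES`, not the order of the input, so any preference between providers would still have to come from an explicit policy.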

Rules:

  • Backends request a runtime capability; they do not own provider installation or global detection.
  • Provider code must be optional and import-light until selected.
  • Provider selection must be explicit or driven by a documented auto policy.
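
One possible shape for these rules, as a sketch only: a registry owns availability checks and selection, and a backend simply asks it for a capability. `ProviderSpec`, `ProviderRegistry`, the `capabilities` strings, and the `auto_order` policy are all assumptions made for this example. Availability is probed with `importlib.util.find_spec`, which locates a top-level module without importing it, and the provider's `factory` runs only after the provider has been selected.

```python
import importlib.util
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class ProviderSpec:
    """Hypothetical registration record for one runtime provider."""
    name: str
    module: str                    # top-level module probed for availability
    capabilities: FrozenSet[str]   # e.g. {"onnx-inference"}; strings are illustrative
    factory: Callable[[], object]  # heavy imports belong inside this callable

    def is_available(self) -> bool:
        # find_spec looks the module up without executing it.
        return importlib.util.find_spec(self.module) is not None


class ProviderRegistry:
    """Hypothetical registry: backends call select(); they never probe devices."""

    def __init__(self, auto_order: Sequence[str] = ()):
        self._specs: Dict[str, ProviderSpec] = {}
        self._auto_order = list(auto_order)  # the documented auto policy

    def register(self, spec: ProviderSpec) -> None:
        self._specs[spec.name] = spec

    def select(self, capability: str, explicit: Optional[str] = None) -> object:
        if explicit is not None:
            # Explicit selection fails loudly instead of falling back.
            spec = self._specs.get(explicit)
            if spec is None:
                raise KeyError(f"unknown provider: {explicit}")
            if capability not in spec.capabilities:
                raise ValueError(f"{explicit} does not offer {capability}")
            if not spec.is_available():
                raise RuntimeError(f"{explicit} is not installed")
            return spec.factory()
        # Auto policy: first registered, capable, installed provider in order.
        for name in self._auto_order:
            spec = self._specs.get(name)
            if spec and capability in spec.capabilities and spec.is_available():
                return spec.factory()
        raise RuntimeError(f"no available provider offers {capability}")
```

A backend would then call something like `registry.select("onnx-inference")` and use whatever runtime object comes back. Making explicit selection raise instead of silently falling back is a design choice of this sketch, not something the issue prescribes.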

Acceptance criteria:

  • Proposed API for runtime provider selection is documented.
  • At least one provider adapter has tests with mocked availability.
  • Backend code remains provider-agnostic unless a model genuinely needs a provider-specific graph.
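
The "mocked availability" criterion could be met with a test along the following lines. This is a sketch under assumptions: `onnxruntime_installed` stands in for whatever availability probe the real adapter ends up exposing, and the test patches `importlib.util.find_spec` with `unittest.mock` so that it passes whether or not ONNX Runtime is installed on the test machine.

```python
import importlib.util
import unittest
from unittest import mock


def onnxruntime_installed() -> bool:
    """Hypothetical availability probe for an ONNX Runtime adapter."""
    return importlib.util.find_spec("onnxruntime") is not None


class OnnxRuntimeAvailabilityTest(unittest.TestCase):
    def test_available_when_module_found(self):
        # Any non-None return value stands in for a real module spec.
        with mock.patch("importlib.util.find_spec", return_value=object()) as probe:
            self.assertTrue(onnxruntime_installed())
        probe.assert_called_once_with("onnxruntime")

    def test_unavailable_when_module_missing(self):
        with mock.patch("importlib.util.find_spec", return_value=None):
            self.assertFalse(onnxruntime_installed())


def run_tests() -> unittest.TestResult:
    """Run the two tests above and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OnnxRuntimeAvailabilityTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the probe never imports the provider package, the same test should run unchanged in a default install that ships no optional runtimes.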

Labels: enhancement