Summary
GPU inference is currently unreachable in iscc-sci for two independent reasons, and both need fixing before [gpu] can work:
Packaging: the gpu extra adds onnxruntime-gpu on top of the unconditional base dependency onnxruntime (pyproject: base dep + gpu = ["onnxruntime-gpu"]). Both wheels ship the same onnxruntime package directory and collide in site-packages; in practice the CPU wheel wins and CUDA is silently unavailable. The flaw and the fix options are the same as in iscc-sct#23, "[gpu] extra is a no-op: base onnxruntime dependency shadows onnxruntime-gpu (silent CPU fallback)".
Session creation never selects CUDA: model() in iscc_sci/code_semantic_image.py creates the session without a providers argument:
_model = rt.InferenceSession(model_path)
Verified with onnxruntime-gpu 1.26.0 correctly installed on a CUDA-capable machine (RTX 3090 Ti, CUDA 12.9, cuDNN 9.17): the call succeeds but the resulting session reports providers=['CPUExecutionProvider'] — CUDA is simply not selected. So even with the packaging fixed, iscc-sci would still run on CPU.
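The two failure modes described above can be told apart with a small diagnostic. The following is a minimal sketch, not part of iscc-sci: the helper names installed_onnxruntime_dists and diagnose are hypothetical, and it only assumes the standard importlib.metadata API plus onnxruntime's get_available_providers().

```python
from importlib import metadata


def installed_onnxruntime_dists():
    """Return {distribution name: version} for the onnxruntime wheels that are installed."""
    found = {}
    for name in ("onnxruntime", "onnxruntime-gpu"):
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


def diagnose(dists, available_providers):
    """List likely reasons why CUDA is unreachable (hypothetical helper)."""
    problems = []
    # Both wheels write into the same onnxruntime package directory.
    if "onnxruntime" in dists and "onnxruntime-gpu" in dists:
        problems.append("onnxruntime and onnxruntime-gpu are both installed")
    # Without this provider, no session can run on the GPU.
    if "CUDAExecutionProvider" not in available_providers:
        problems.append("CUDAExecutionProvider is not available")
    return problems


if __name__ == "__main__":
    dists = installed_onnxruntime_dists()
    try:
        import onnxruntime as rt
        providers = rt.get_available_providers()
    except ImportError:
        providers = []
    print(dists, providers, diagnose(dists, providers))
```

An empty result from diagnose() only rules out the packaging problem. To confirm which provider a session actually uses, session.get_providers() on the created InferenceSession is still the authoritative check.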
Expected impact
Not benchmarked for iscc-sci specifically, but the analogous fix in iscc-sct (transformer ONNX model, CUDA EP, RTX 3090 Ti) cut embedding time by 16.8x with bit-identical output, which suggests a comparable gain for the vision transformer used here.
Environment
Windows 10, Python 3.13, iscc-sci 0.2.0 (also verified on current main), onnxruntime/onnxruntime-gpu 1.26.0.
Suggested fix
Mirror iscc-sct's provider selection in model(). (Optionally set SessionOptions.graph_optimization_level = ORT_ENABLE_ALL for parity with iscc-sct.)
Apply the same packaging fix as decided for iscc-sct#23, "[gpu] extra is a no-op: base onnxruntime dependency shadows onnxruntime-gpu (silent CPU fallback)": exclusive [cpu]/[gpu] extras, or drop the [gpu] extra and document the wheel-swap workaround.
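The provider selection suggested for model() could look roughly like the sketch below. It is an assumption about the shape of the fix, not a copy of iscc-sct's code: select_providers and create_session are hypothetical names, and the only onnxruntime calls used are get_available_providers(), SessionOptions, GraphOptimizationLevel.ORT_ENABLE_ALL and the providers argument of InferenceSession.

```python
def select_providers(available):
    """Prefer CUDA when the installed build offers it, otherwise fall back to CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def create_session(model_path):
    """Hypothetical stand-in for the session creation inside model()."""
    # Imported lazily so select_providers() works without onnxruntime installed.
    import onnxruntime as rt

    so = rt.SessionOptions()
    # Optional: enable all graph optimizations, as suggested for parity with iscc-sct.
    so.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL
    providers = select_providers(rt.get_available_providers())
    return rt.InferenceSession(model_path, sess_options=so, providers=providers)
```

Passing an explicit, ordered provider list keeps the CPU fallback while letting CUDA take priority whenever the GPU build is installed. Calling get_providers() on the returned session shows which providers were actually applied.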