Using separate cuda streams for one session #23319

Open
cozeybozey opened this issue Jan 10, 2025 · 0 comments
Labels
ep:CUDA issues related to the CUDA execution provider

Describe the issue

I have multiple threads calling session.run on a single session. I recently switched to pinned memory and asynchronous memory copies, which works great. To do this, I use separate CUDA streams for the memory copies. However, I noticed that session.run does not work with these CUDA streams. I can link one CUDA stream to a session via the options, but I want to use multiple CUDA streams, with a different one for every run call. How can I achieve this? Or should I just use multiple sessions instead? That would leave multiple instances of the same model in memory, which doesn't seem great.
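
To make the multiple-sessions fallback concrete, here is a minimal sketch of a bounded pool in which each entry is tied to exactly one stream and worker threads lease an entry per run call. It uses stand-in types only and does not call ONNX Runtime or CUDA: `StreamBoundSession`, `SessionPool`, `PoolResult`, and `run_workers` are hypothetical names, and `stream_id` is a placeholder for a real stream handle. In real code each pooled entry would presumably hold a session created with its own stream through the session options.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an inference session bound to one CUDA stream.
struct StreamBoundSession {
    int stream_id;               // placeholder for a real stream handle
    std::atomic<int> in_use{0};  // concurrent users; should never exceed 1
    int runs = 0;                // completed run() calls

    explicit StreamBoundSession(int id) : stream_id(id) {}

    // Placeholder for the real run call; returns false if two threads overlapped.
    bool run() {
        bool exclusive = (in_use.fetch_add(1) == 0);
        ++runs;
        in_use.fetch_sub(1);
        return exclusive;
    }
};

// Fixed-size pool: at most pool_size sessions (model copies) ever exist.
class SessionPool {
public:
    explicit SessionPool(std::size_t pool_size) {
        for (std::size_t i = 0; i < pool_size; ++i) {
            sessions_.push_back(
                std::make_unique<StreamBoundSession>(static_cast<int>(i)));
            free_.push_back(sessions_.back().get());
        }
    }

    // Blocks until a session is free, then hands it to the caller exclusively.
    StreamBoundSession* acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return !free_.empty(); });
        StreamBoundSession* s = free_.back();
        free_.pop_back();
        return s;
    }

    // Returns a session to the pool and wakes one waiting thread.
    void release(StreamBoundSession* s) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(s);
        }
        available_.notify_one();
    }

    // Call only after all worker threads have finished.
    int total_runs() const {
        int total = 0;
        for (const auto& s : sessions_) total += s->runs;
        return total;
    }

private:
    std::vector<std::unique_ptr<StreamBoundSession>> sessions_;
    std::vector<StreamBoundSession*> free_;
    std::mutex mutex_;
    std::condition_variable available_;
};

struct PoolResult {
    int total_runs;
    bool always_exclusive;
};

// Spawns num_threads workers that each lease a session for every run call.
PoolResult run_workers(std::size_t pool_size, int num_threads, int runs_per_thread) {
    SessionPool pool(pool_size);
    std::atomic<bool> always_exclusive{true};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < runs_per_thread; ++i) {
                StreamBoundSession* s = pool.acquire();
                if (!s->run()) always_exclusive = false;
                pool.release(s);
            }
        });
    }
    for (auto& w : workers) w.join();
    return {pool.total_runs(), always_exclusive.load()};
}
```

With a pool like this, the number of model copies in memory is capped by the pool size rather than by the number of threads, at the cost of threads waiting when every entry is leased. It does not answer whether a single session can take a different stream per run call.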

To reproduce

--

Urgency

No response

Platform

Windows

OS Version

10

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

16.2

ONNX Runtime API

C++

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

github-actions bot added the ep:CUDA (issues related to the CUDA execution provider) label on Jan 10, 2025