Module rustacuda::context
CUDA context management
Most CUDA functions require a context. A CUDA context is analogous to a CPU process - it's an isolated container for all runtime state, including configuration settings and the device/unified/page-locked memory allocations. Each context has a separate memory space, and pointers from one context do not work in another. Each context is associated with a single device. Although it is possible to have multiple contexts associated with a single device, this is strongly discouraged as it can cause a significant loss of performance.
CUDA keeps a thread-local stack of contexts which the programmer can push to or pop from. The top context in that stack is known as the "current" context and it is used in most CUDA API calls. One context can be safely made current in multiple CPU threads.
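As a minimal sketch of that stack mechanism: a context created with Context::create_and_push immediately becomes the current context for the calling thread, and it can be popped off and made current again later. This assumes ContextStack exposes a push counterpart to the pop used in the examples below; check the ContextStack page for the exact API.

use rustacuda::context::{Context, ContextFlags, ContextStack};
use rustacuda::device::Device;

rustacuda::init(rustacuda::CudaFlags::empty())?;
let device = Device::get_device(0)?;

// `create_and_push` pushes the new context onto this thread's stack,
// making it the current context.
let context = Context::create_and_push(ContextFlags::SCHED_AUTO, device)?;

// Pop it off the stack; this thread now has no current context.
let popped = ContextStack::pop()?;

// Push it back to make it current again (assumed `push` method).
ContextStack::push(&popped)?;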
Safety
The CUDA context management API does not fit easily into Rust's safety guarantees.
The thread-local stack (as well as the fact that any context can be on the stack for any number of threads) means there is no clear owner for a CUDA context, but it still has to be cleaned up. Also, the fact that a context can be current to multiple threads at once means that there can be multiple implicit references to a context which are not controlled by Rust.
RustaCUDA handles ownership by providing an owning Context struct and a non-owning UnownedContext. When the Context is dropped, the backing context is destroyed. The context could be current on other threads, though. In this case, the context is still destroyed, and attempts to access the context on other threads will fail with an error. This is (mostly) safe, if a bit inconvenient. It's only mostly safe because other threads could be accessing that context while the destructor is running on this thread, which could result in undefined behavior.

In short, Rust's thread-safety guarantees cannot fully protect use of the context management functions. The programmer must ensure that no other OS threads are using the Context when it is dropped.
Examples
For most common uses (one device, one OS thread) it should suffice to create a single context:
use rustacuda::device::Device;
use rustacuda::context::{Context, ContextFlags};

rustacuda::init(rustacuda::CudaFlags::empty())?;
let device = Device::get_device(0)?;
let context = Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;

// call RustaCUDA functions which use the context

// The context will be destroyed when it is dropped or falls out of scope.
drop(context);
If you have multiple OS threads that each submit work to the same device, you can get a handle to the single context and pass it to each thread.
// As before
let context = Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;

let mut join_handles = vec![];
for _ in 0..4 {
    let unowned = context.get_unowned();
    let join_handle = std::thread::spawn(move || {
        CurrentContext::set_current(&unowned).unwrap();
        // Call RustaCUDA functions which use the context
    });
    join_handles.push(join_handle);
}
// We must ensure that the other threads are not using the context when it's destroyed.
for handle in join_handles {
    handle.join().unwrap();
}
// Now it's safe to drop the context.
drop(context);
If you have multiple devices, each device needs its own context.
// Create and pop contexts for each device
let mut contexts = vec![];
for device in Device::devices()? {
    let device = device?;
    let ctx = Context::create_and_push(ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;
    ContextStack::pop()?;
    contexts.push(ctx);
}
CurrentContext::set_current(&contexts[0])?;

// Call RustaCUDA functions which will use the context
Structs
Context | Owned handle to a CUDA context.
ContextFlags | Bit flags for initializing the CUDA context.
ContextStack | Type used to represent the thread-local context stack.
CurrentContext | Type representing the top context in the thread-local stack (see the sketch after this list).
StreamPriorityRange | Struct representing a range of stream priorities.
UnownedContext | Non-owning handle to a CUDA context.
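A brief sketch of querying the current context via the types above. It assumes CurrentContext exposes get_device and get_stream_priority_range accessors and that StreamPriorityRange has least and greatest fields; check the individual item pages for the exact signatures.

use rustacuda::context::CurrentContext;

// Device backing the current context (assumed accessor).
let _device = CurrentContext::get_device()?;

// Range of priorities accepted when creating streams on this context
// (assumed accessor and field names).
let range = CurrentContext::get_stream_priority_range()?;
println!("stream priorities: {} to {}", range.least, range.greatest);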
Enums
CacheConfig | This enumeration represents configuration settings for devices which share hardware resources between L1 cache and shared memory.
ResourceLimit | This enumeration represents the limited resources which can be accessed through CurrentContext::get_resource_limit and CurrentContext::set_resource_limit (see the sketch after this list).
SharedMemoryConfig | This enumeration represents the options for configuring the shared memory bank size.
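A hedged sketch of adjusting limits and cache configuration through CurrentContext. The getter and setter names come from the list above, but ResourceLimit::StackSize, set_cache_config, and CacheConfig::PreferL1 are assumed names; consult the enum and CurrentContext pages for the full set of variants and methods.

use rustacuda::context::{CacheConfig, CurrentContext, ResourceLimit};

// Query the current per-thread stack size limit for kernels, then double it
// (`ResourceLimit::StackSize` is an assumed variant name).
let stack_size = CurrentContext::get_resource_limit(ResourceLimit::StackSize)?;
CurrentContext::set_resource_limit(ResourceLimit::StackSize, stack_size * 2)?;

// Prefer L1 cache over shared memory on devices that share that hardware
// (`set_cache_config` and `CacheConfig::PreferL1` are assumed names).
CurrentContext::set_cache_config(CacheConfig::PreferL1)?;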
Traits
ContextHandle | Sealed trait for Context and UnownedContext.