From the Khronos website, on the thread safety of clSetKernelArg:
All OpenCL API calls are thread-safe except clSetKernelArg, which is safe to call from any host thread, and is safe to call re-entrantly so long as concurrent calls operate on different cl_kernel objects. However, the behavior of the cl_kernel object is undefined if clSetKernelArg is called from multiple host threads on the same cl_kernel object at the same time.
My question is: is there a way to make this behavior defined, so that a single kernel object can be read from and written to by multiple threads?
I thought that using std::atomic on the object being modified by the kernels would prevent this undefined behavior, but in what I have tried, the kernel's output contains the wrong values. Is there a better way to implement this, or a known technique for dealing with such a case?
This would be useful when the allocated object is so large that creating a new object for every kernel execution costs too much memory, and a shared/overridable object would be preferred.
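For comparison, one conventional host-side pattern is to hold a mutex across both the argument update and the enqueue, so that no other thread can change the kernel's arguments in between. Below is a minimal sketch of that idea. It does not use the real OpenCL API: `FakeKernel`, `set_arg_and_enqueue`, and `run_shared_kernel` are hypothetical names, and the mock assumes that an enqueue snapshots the argument values that are current at the time of the call.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a cl_kernel (NOT the OpenCL API).
// It holds one "current argument" and, by assumption, snapshots
// that argument at enqueue time.
struct FakeKernel {
    int arg = 0;                // models state written by clSetKernelArg
    std::vector<int> launched;  // argument value captured by each enqueue
    std::mutex guard;           // host-side lock owned by the application

    // Models clSetKernelArg followed by an enqueue, done as a single
    // critical section so the two steps cannot interleave across threads.
    void set_arg_and_enqueue(int value) {
        std::lock_guard<std::mutex> lock(guard);
        arg = value;              // "set the argument"
        launched.push_back(arg);  // "enqueue", capturing the argument
    }
};

// Starts `threads` host threads that all share one FakeKernel and
// returns the captured arguments, sorted so the result is deterministic.
std::vector<int> run_shared_kernel(int threads) {
    assert(threads > 0);
    FakeKernel kernel;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&kernel, t] { kernel.set_arg_and_enqueue(t); });
    }
    for (auto& th : pool) {
        th.join();
    }
    std::sort(kernel.launched.begin(), kernel.launched.end());
    return kernel.launched;
}
```

Calling `run_shared_kernel(4)` should give each thread exactly one launch with its own argument value. The lock only serializes host-side access to the kernel object; it does nothing about kernels that race on the same buffer on the device, which is a separate concern.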