Hello -
I have set a challenge for myself that I believe could be solved by encoding an indirect command buffer with compute commands. The Metal specification says there is a compute_command type that works like render_command. Oddly, the compiler does not recognize compute_command, and I do not see the type in the header where the specification says it should be. Does anyone know if it has been removed or deprecated?
For a short background on my challenge, I am trying to leverage the GPU to process what would normally be a two-dimensional array on the CPU. My first attempt was to use a single compute function with a 2D thread group.
The problem with this solution is that the data structure is not perfectly square like a texture would be. The structure would result in something like the following:
0 | 1 | 2 | 3 | |
1 | X | X | -- | -- |
2 | X | -- | -- | -- |
3 | X | X | X | X |
Where X represents a thread with a valid subscript for both parts of the “array”. Since the whole group is dispatched at once, this solution leads to a crash because some threads will try to execute with at least one subscript that is out of bounds.
To resolve this, I tried to return early whenever a thread would access a null element. I am inexperienced with C++, so perhaps there is an easy way to do this, but I was unsuccessful. Plus, that solution could leave a lot of threads unused.
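For reference, here is a minimal sketch of the early-return idea I was aiming for. It is written as plain C++ running on the CPU, not as a Metal kernel. The names processThread and dispatchGrid are hypothetical, and it assumes the length of each row is available to every thread, which on the GPU would probably mean passing the lengths in a separate buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel body: called once per (row, col) thread.
// Returns true if the thread did real work, false if it exited early.
bool processThread(const std::vector<std::vector<float>>& data,
                   std::vector<std::vector<float>>& out,
                   std::size_t row, std::size_t col) {
    // Guard: compare against known lengths instead of testing for null,
    // so out-of-range threads return before touching memory.
    if (row >= data.size() || col >= data[row].size()) {
        return false;
    }
    assert(row < out.size() && col < out[row].size());
    out[row][col] = data[row][col] * 2.0f;  // placeholder for the real work
    return true;
}

// Simulates dispatching a full width x height grid over the jagged data
// and counts how many threads were not wasted.
std::size_t dispatchGrid(const std::vector<std::vector<float>>& data,
                         std::vector<std::vector<float>>& out,
                         std::size_t width, std::size_t height) {
    assert(out.size() == data.size());
    std::size_t active = 0;
    for (std::size_t row = 0; row < height; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            if (processThread(data, out, row, col)) {
                ++active;
            }
        }
    }
    return active;
}
```

With rows of length 2, 1, and 4, as in the diagram above, a 4 x 3 grid launches 12 threads and only 7 of them do any work, which is the waste I mentioned.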
Alternatively, on the CPU side, I could loop and dispatch a 1D thread group on each iteration, but I believe that eliminates the usefulness of the GPU in this case. The same issue would apply if I wrote the loop on the GPU side.