The approach to ICB command creation from chapter 15 will generate a draw_indexed_primitives command for every instance of a given mesh.
Now I want to implement instanced rendering, as described in chapter 13. What would be the best approach for implementing this in a GPU-driven render loop?
I’ve looked through the WWDC presentations and Apple’s sample projects, and could not find a draw_indexed_primitives call with an instanceCount higher than 1.
The only approach that I can think of right now is to:
1. group the meshes on the CPU first
2. run the ICB encoding kernel separately for each mesh type with a potential instance count higher than 1 (“potential” in the sense that an instance may be occluded or frustum-culled)
3. run the ICB encoding kernel for all remaining meshes that have an instance count of at most 1
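To make the CPU-side grouping step concrete, here is a minimal C++ sketch of how the split could be done. `MeshInstance`, `Grouping` and `groupByMesh` are hypothetical names made up for illustration, not anything from the book or from Metal; the idea is only to separate mesh types with more than one potential instance from the rest.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical record: which mesh type an object uses and where its
// per-instance data lives.
struct MeshInstance {
    uint32_t meshId;      // identifier of the mesh type (assumption)
    uint32_t instanceId;  // index into the per-instance data buffer (assumption)
};

struct Grouping {
    // Mesh types with more than one potential instance: one encoding pass each.
    std::map<uint32_t, std::vector<uint32_t>> instanced;
    // Everything else: encoded together in a single pass.
    std::vector<MeshInstance> singles;
};

inline Grouping groupByMesh(const std::vector<MeshInstance>& all) {
    // Bucket instance ids by mesh type.
    std::map<uint32_t, std::vector<uint32_t>> byMesh;
    for (const MeshInstance& m : all) {
        byMesh[m.meshId].push_back(m.instanceId);
    }

    // Split the buckets into "instanced" and "single" groups.
    Grouping g;
    for (auto& entry : byMesh) {
        if (entry.second.size() > 1) {
            g.instanced[entry.first] = std::move(entry.second);
        } else {
            g.singles.push_back({entry.first, entry.second.front()});
        }
    }
    return g;
}
```

Each entry in `instanced` would then get its own encoding dispatch, which is exactly the per-frame CPU work I’d like to avoid.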
This approach should work, but is less than satisfactory, given that I’m aiming for a fully GPU-driven render loop.
So I’m wondering if there may be other techniques for achieving instanced rendering that do not rely on the CPU. Any ideas?
Hi Caroline, thanks for replying. The book has been a great help, BTW.
The draw call itself isn’t the problem. Once you have determined which instances need to be rendered, you can set up the matching buffers and instance count.
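As a rough illustration of that "determine the instances, then fill in the count" step, here is a CPU-side C++ model of what a culling kernel could do for one mesh type. All names (`Instance`, `DrawArgs`, `isVisible`, `cullAndCompact`) and the single-plane visibility test are hypothetical stand-ins, not Metal API or the book's code. On the GPU, the loop body would run as one thread per instance and the counter would be a device atomic.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical per-instance data; a real renderer would carry a full transform.
struct Instance {
    float x, y, z;   // world-space center of the bounding sphere
    float radius;
};

// Fields an indexed, instanced draw needs (illustrative layout only).
struct DrawArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t baseInstance;
};

// Stand-in visibility test: keep spheres that reach in front of the plane z = 0.
// A real kernel would test all frustum planes and possibly occlusion.
inline bool isVisible(const Instance& i) {
    return i.z + i.radius > 0.0f;
}

// Models "one thread per instance": each visible instance claims a slot in
// visibleIds through an atomic counter, which ends up holding the instance count.
inline DrawArgs cullAndCompact(const std::vector<Instance>& instances,
                               uint32_t indexCount,
                               std::vector<uint32_t>& visibleIds) {
    std::atomic<uint32_t> counter{0};
    visibleIds.assign(instances.size(), 0);
    for (uint32_t id = 0; id < instances.size(); ++id) {  // one GPU thread each
        if (isVisible(instances[id])) {
            uint32_t slot = counter.fetch_add(1, std::memory_order_relaxed);
            visibleIds[slot] = id;
        }
    }
    uint32_t count = counter.load();
    visibleIds.resize(count);
    // An instanceCount of 0 means the draw can be skipped entirely.
    return DrawArgs{indexCount, count, 0};
}
```

The vertex function would then look up its transform through `visibleIds[instance_id]` rather than indexing the instance buffer directly. In this serial model the compacted order matches the input order, but with real GPU threads the slot order is not deterministic.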
Having thought about it some more, the solution seems to be to encode the compute dispatches on the GPU.
This is briefly touched upon in the WWDC19 Metal presentation, around the 41:30 mark.
So basically this part of the render loop would have to be moved to its own kernel on the GPU:
Admittedly, this is all well beyond the scope of chapter 15.