Having made a CPU-based particle system, I wanted to try making a GPU-based one during the Tools course at TGA.
I implemented it in our engine, RatTrap, where it can be added to objects as a component.
My goal was for it to run live in the editor, so the procedural artists can more easily see what they are doing and how it looks. I also wanted it to simulate collision with at least the terrain, and to support bone parenting.
Since I had never used Compute Shaders before, the first step was making sure I could update particles on the GPU. To start with, I just had 1000 particles that spawned at one specific position and were moved back to the start position when they got too low. This let me see the particles being updated with a gravitational pull, while also making sure they rendered.
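A minimal sketch of that first test, written as plain C++ for illustration rather than as the actual compute shader. The names (Particle, UpdateParticle, resetHeight) are hypothetical, and on the GPU the function body would correspond to the work of one thread per particle.

```cpp
#include <cassert>

// Hypothetical particle layout; the real GPU struct is not shown in this text.
struct Particle {
    float position[3];
    float velocity[3];
};

// One simulation step for a single particle.
// Applies gravity, integrates the position, and moves the particle back to
// the start position once it has fallen below resetHeight.
void UpdateParticle(Particle& p, const float startPosition[3],
                    float gravity, float resetHeight, float deltaTime) {
    assert(deltaTime >= 0.0f);

    p.velocity[1] -= gravity * deltaTime;        // gravitational pull on Y
    for (int i = 0; i < 3; ++i) {
        p.position[i] += p.velocity[i] * deltaTime;
    }

    if (p.position[1] < resetHeight) {           // particle got too low
        for (int i = 0; i < 3; ++i) {
            p.position[i] = startPosition[i];
            p.velocity[i] = 0.0f;
        }
    }
}
```

Calling UpdateParticle once per particle per frame reproduces the looping "fall and reset" behaviour, which is enough to confirm that both the update and the rendering work.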
The next step was setting up a way to spawn particles at runtime, which is something I tested and rewrote a couple of times during the development of the particle system.
The biggest problem I had with spawning was determining which particles were dead. The idea was to use a Dead List for that, but when using an Append/Consume Buffer I ran into the problem that in DX11, while Consume is thread safe, it doesn't prevent you from consuming from an empty buffer. On top of that, you can't get the current size of the buffer in the shader, since the counter is hidden. So I tested various ways to solve that problem. One of them was allocating blocks of the particle buffer to each emitter based on how many particles it would need, and doing the same with the dead buffer, which was changed to a Read Write Buffer. The problem with this was that even if an emitter wasn't active, for example a blood splatter, it still took up space in the particle buffer.
In the end I decided to use an Append/Consume Buffer for the dead list, but to store the size of the buffer in a separate Read Write Buffer. While I need to make sure it stays in sync when particles spawn and die, it prevents the dead list size from underflowing by using thread-safe checks.
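The idea of guarding the dead list with an explicit counter can be sketched on the CPU with std::atomic. This is only an illustration under assumptions: the class and method names are hypothetical, the compare-and-swap loop stands in for whatever interlocked operations the shader uses, and appends and consumes are assumed to run in separate passes rather than interleaved.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU stand-in for a dead list plus a separately stored size.
class DeadList {
public:
    explicit DeadList(uint32_t capacity) : slots_(capacity), count_(capacity) {
        // Every particle starts out dead, so every index is available.
        for (uint32_t i = 0; i < capacity; ++i) slots_[i] = i;
    }

    // Takes one dead particle index. Returns false instead of underflowing
    // when the list is empty.
    bool TryConsume(uint32_t& outIndex) {
        uint32_t current = count_.load();
        while (current > 0) {
            // Only decrement if nobody else changed the count in between;
            // on failure, current is reloaded and the loop re-checks it.
            if (count_.compare_exchange_weak(current, current - 1)) {
                outIndex = slots_[current - 1];
                return true;
            }
        }
        return false;
    }

    // Returns a particle index to the list when the particle dies.
    void Append(uint32_t index) {
        const uint32_t slot = count_.fetch_add(1);
        assert(slot < slots_.size());
        slots_[slot] = index;
    }

    uint32_t Count() const { return count_.load(); }

private:
    std::vector<uint32_t> slots_;
    std::atomic<uint32_t> count_;
};
```

An emitter that wants to spawn a particle calls TryConsume and simply skips the spawn when it returns false, so an empty list never hands out garbage indices.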
One of the more challenging parts of creating the particle system was simulating physics for the particles on the GPU. Since this was my first time working with Compute Shaders, I was thinking more in terms of how you would do it on the CPU than on the GPU, and since we had a Height Map for the terrain I figured I could turn it into a texture for the GPU.
Since the Height Map was split into chunks, I made a texture for each chunk, with the texture containing only the height. I then sampled it on the GPU using a SampleTerrainHeight function and used the result to simulate collision with the ground, which worked pretty well.
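As a rough CPU-side illustration of height map collision, the sketch below samples one chunk with bilinear interpolation and pushes a particle back up when it ends up below the ground. Only the name SampleTerrainHeight comes from the text above; the chunk layout, the bilinear filtering, and the bounce response with a restitution factor are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical layout of one height map chunk.
struct HeightChunk {
    int resolution;             // texels per side
    float originX, originZ;     // world position of texel (0, 0)
    float texelSize;            // world units between neighbouring texels
    std::vector<float> heights; // row-major: heights[z * resolution + x]
};

// Bilinearly interpolated terrain height at a world-space XZ position.
float SampleTerrainHeight(const HeightChunk& chunk, float worldX, float worldZ) {
    assert(chunk.resolution >= 2);
    assert(chunk.heights.size() ==
           static_cast<std::size_t>(chunk.resolution) * chunk.resolution);

    const float maxCoord = static_cast<float>(chunk.resolution - 1);
    // Convert to texel space and clamp to the chunk borders.
    const float u = std::min(std::max((worldX - chunk.originX) / chunk.texelSize, 0.0f), maxCoord);
    const float v = std::min(std::max((worldZ - chunk.originZ) / chunk.texelSize, 0.0f), maxCoord);

    const int x0 = std::min(static_cast<int>(std::floor(u)), chunk.resolution - 2);
    const int z0 = std::min(static_cast<int>(std::floor(v)), chunk.resolution - 2);
    const float fx = u - static_cast<float>(x0);
    const float fz = v - static_cast<float>(z0);

    const float h00 = chunk.heights[z0 * chunk.resolution + x0];
    const float h10 = chunk.heights[z0 * chunk.resolution + x0 + 1];
    const float h01 = chunk.heights[(z0 + 1) * chunk.resolution + x0];
    const float h11 = chunk.heights[(z0 + 1) * chunk.resolution + x0 + 1];

    const float near = h00 + (h10 - h00) * fx;
    const float far = h01 + (h11 - h01) * fx;
    return near + (far - near) * fz;
}

// Snaps a particle to the ground and bounces it if it fell below the terrain.
// Returns true when a collision was resolved.
bool CollideWithTerrain(const HeightChunk& chunk, float position[3],
                        float velocity[3], float restitution) {
    const float ground = SampleTerrainHeight(chunk, position[0], position[2]);
    if (position[1] >= ground) return false;

    position[1] = ground;
    if (velocity[1] < 0.0f) velocity[1] = -velocity[1] * restitution;
    return true;
}
```

On the GPU the interpolation would normally come from the texture sampler itself, so the manual bilinear step here only exists to make the CPU version self-contained.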
With this the particles could collide with the terrain no matter where the player was. However, while it gave us collision with the terrain, it didn't give us collision with static objects such as walls, floors, ceilings, rocks, etc. It also meant storing a texture for every chunk of the height map, which wouldn't scale well to a large world.
Another way to simulate collision with the world would be to use the previous frame's depth buffer instead. With double-buffered rendering it would also be easy to get the previous depth buffer, but that is something we don't have.
Using the depth buffer for collision would give us collision not only with the terrain but also with structures and even objects such as the player or enemies, which is why it's often the go-to solution for particle collision.
The drawback is that you only get collision with the objects that were in the previous frame's depth buffer, so particles can miss a collision even when they actually hit something. However, this is mostly fine, since with particles it's less about realism and more about looking reasonable.
At this point I haven't implemented it yet, but it is in the works. When it is done, the plan is to use both the height map collision check and the depth buffer collision check.
On the CPU we can easily get a random value in C++ using the rand function. On the GPU, however, DX11 gives us no rand function, nor an easy way to get random numbers. So how do we get random numbers on the GPU?
One way would be to have the CPU create the random numbers and send them to the GPU. However, that would mean either sending multiple random numbers for each particle or reusing the same random numbers for multiple particles, which would be very inefficient.
Instead, it is better to have the CPU give us a single number and combine it with the particle index to create a seed, which we then feed into either Xorshift or a linear congruential generator (LCG) to generate random numbers.
I decided to use Xorshift, since it gives better random values. When creating the seed, instead of getting a random number from the CPU I use the frame index, which gives me slightly less random values but makes the result more deterministic.
The GIF showcases the use of the random values for the particles in our game Walter Volt, followed by the code for the random function. One very nice thing about this implementation is that even if max is set to a value lower than min, it will still return a random value between the two, since we use lerp. The only truly random value we generate is between 0 and 1, and it determines where between the two numbers we end up.
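For readers who want to experiment with the approach, here is a minimal C++ sketch of it, not the shader code from the project. The shift constants are the standard 32-bit Xorshift triple (13, 17, 5); the seed mixing in MakeSeed and all function names are assumptions chosen for the example.

```cpp
#include <cstdint>

// Standard 32-bit Xorshift step. The state must never be zero.
uint32_t Xorshift32(uint32_t state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Hypothetical seed mixing: combines the frame index from the CPU with the
// particle index so that every particle gets its own sequence each frame.
uint32_t MakeSeed(uint32_t frameIndex, uint32_t particleIndex) {
    const uint32_t seed = particleIndex * 747796405u + frameIndex * 2891336453u + 1u;
    return seed != 0u ? seed : 1u; // avoid the all-zero state
}

// Advances the state and returns a value in [0, 1).
float Random01(uint32_t& state) {
    state = Xorshift32(state);
    // Keep the top 24 bits so the result is exactly representable as a float.
    return static_cast<float>(state >> 8) * (1.0f / 16777216.0f);
}

float Lerp(float a, float b, float t) {
    return a + (b - a) * t;
}

// Random value between the two bounds. Because it is a lerp, the result
// stays between them even if maxValue is lower than minValue.
float RandomRange(uint32_t& state, float minValue, float maxValue) {
    return Lerp(minValue, maxValue, Random01(state));
}
```

Since the seed depends only on the frame index and the particle index, running the same frame twice gives the same values, which is what makes the result reproducible.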
When making the random function I also made it take a Min and a Max value. With that as a base, I started working on letting us control the values with a curve, which we can define in the editor.
I started by letting the curves affect the most visible properties of the particle, namely the color and alpha. Later I added curve control for the size of the particle as well.
This made it easy to verify that the curve worked properly, both in the editor and when used on the GPU.
The editor for the curves was made using ImPlot, and only the control points are stored in a JSON file. The curve itself is constructed at runtime for the editor. For the particle system it is constructed at startup, or when the editor updates it, and we store it on the GPU as a Texture 2D Array of floats.
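Building a curve from its control points can be sketched as baking it into a fixed-size table of floats, where one table would correspond to one row or slice of the texture. This is an illustration under assumptions: it uses linear interpolation between control points sorted by time, whereas the actual curve type is not specified here, and the names ControlPoint, EvaluateCurve and BakeCurve are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical control point: time in [0, 1] over the particle's lifetime.
struct ControlPoint {
    float time;
    float value;
};

// Evaluates the curve at t, assuming the points are sorted by time and that
// neighbouring points are connected by straight lines.
float EvaluateCurve(const std::vector<ControlPoint>& points, float t) {
    assert(!points.empty());
    if (t <= points.front().time) return points.front().value;
    if (t >= points.back().time) return points.back().value;

    for (std::size_t i = 1; i < points.size(); ++i) {
        if (t <= points[i].time) {
            const ControlPoint& a = points[i - 1];
            const ControlPoint& b = points[i];
            const float span = b.time - a.time;
            const float local = span > 0.0f ? (t - a.time) / span : 0.0f;
            return a.value + (b.value - a.value) * local;
        }
    }
    return points.back().value;
}

// Samples the curve at evenly spaced times into a table that could be
// uploaded to the GPU as texture data.
std::vector<float> BakeCurve(const std::vector<ControlPoint>& points,
                             std::size_t sampleCount) {
    assert(sampleCount >= 2);
    std::vector<float> table(sampleCount);
    for (std::size_t i = 0; i < sampleCount; ++i) {
        const float t = static_cast<float>(i) / static_cast<float>(sampleCount - 1);
        table[i] = EvaluateCurve(points, t);
    }
    return table;
}
```

Storing only the control points and rebaking the table whenever they change keeps the saved data small, while the shader only needs a single lookup using the particle's normalized age.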