Comments on Dead Voxels: "Has someone tried this before?"

Steve (2009-12-14 06:40):
I was able to do a quick and dirty implementation of this -- here are the results: http://solid-angle.blogspot.com/2009/12/screen-space-spherical-harmonic.html

Jon Greenberg (2009-08-15 08:45):
Good numbers, and interesting that it might prove cheaper in the long run. As far as downsampling Z vs. merely re-rendering... I suppose that depends on the density of the geometry. Let's face it, a first step of downsampling Z probably isn't really that expensive (though admittedly the contents are a bit dubious). I suppose some knowledge of the scene is required to choose.

The only real problem I see with the stippling method is that it really depends on there being only a small number of overdrawn samples at a particular pixel, since the first four in claim the 2x2 square. Either more regular division in Z or some kind of clustering method to put the detail where you want it would seem more desirable.

Steve (2009-08-14 06:00):
If you just went with straight-up 16-bit floating point buffers (so just 3 RGB SH targets), you're still under the bandwidth of the Crytek solution (at 2.7M).

Steve (2009-08-14 05:56):
Yeah, I was thinking of using a cubemap on platforms where there is no SPU, and it probably would still be faster to use the GPU to render the irradiance buffer.

Something else you could do, if you're going to end up using a discontinuity-aware filter anyway, is borrow from inferred lighting and, for your front slice, render your layers of transparency stippled but still in a separate buffer from the opaque irradiance. As you point out, translucency can probably do without such high-frequency lighting anyway. This would allow you to get 5 depth slices per lighting element (lixel?).

As far as Crytek's solution goes, it's local in the sense that cascaded shadow maps are local (there is some cutoff distance, and results get even lower frequency closer to that distance), which is still a pretty decent way out from the camera. They also do throw some direct lighting in there, mainly as an LOD technique (I think it could be either artist-controlled or based on heuristics of light size vs. cell size). I've mostly been interested in LPVs for indirect lighting -- not from a dynamically computed source, but from a precomputed irradiance volume. This would offer some interesting opportunities, because you could prefilter that static data at different resolutions ahead of time, and then it's a simple resampling problem to generate the volume textures (or to populate the irradiance buffer you propose with indirect lighting).

Interestingly enough, if you do the math assuming you can get away with quarter-res lighting (a 1280x720 render resolution and a 320x180 irradiance buffer):

320x180 x 4 channels (RGB SH + scaling coefficient for 3 channels) x 2 buffers (one for opaque, one for stippled translucency) = 1.8M of bandwidth (assuming 4-byte render targets)

Crytek's LPVs with 6 cascades: 32x32x32 x 4 channels x 6 cascades = 3M of bandwidth

I think most of the videos they have use 2 cascades, but I guess we're not talking unreasonable amounts of bandwidth here.

Is your plan to render the first depth pass at the lower resolution and then just toss that depth buffer for the "true" geometry pass at full res? Seems simpler than attempting to downsample.
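For reference, the arithmetic behind the figures above works out as follows. This is only a back-of-the-envelope sketch; the interpretation that "4-byte render targets" means four RGBA8 targets per irradiance buffer, that the fp16 variant is three RGBA16F targets, and that each LPV cell holds four 4-byte channels is an assumption, not something stated in the comments.

```cpp
// Back-of-the-envelope check of the bandwidth figures quoted above.
// Assumed formats: four RGBA8 targets (4 bytes each) per irradiance buffer,
// three RGBA16F targets (8 bytes each) for the fp16 variant, and four
// 4-byte channels per LPV cell.
#include <cstdio>

int main()
{
    const double irradiancePixels = 320.0 * 180.0;  // quarter-res of 1280x720
    const double buffers          = 2.0;            // opaque + stippled translucency

    const double sh_rgba8 = irradiancePixels * 4 * 4 * buffers;  // 4 targets x 4 bytes
    const double sh_fp16  = irradiancePixels * 3 * 8 * buffers;  // 3 targets x 8 bytes
    const double lpv      = 32.0 * 32.0 * 32.0 * 4 * 4 * 6;      // cells x channels x bytes x cascades

    std::printf("SH buffers, RGBA8: %.2f MB\n", sh_rgba8 / 1.0e6); // ~1.8 MB
    std::printf("SH buffers, fp16:  %.2f MB\n", sh_fp16  / 1.0e6); // ~2.7 MB
    std::printf("LPV, 6 cascades:   %.2f MB\n", lpv      / 1.0e6); // ~3.1 MB
    return 0;
}
```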
Jon Greenberg (2009-08-13 22:03):
Good suggestion -- cubemaps would work great running on the GPU to accelerate things, as you could embed all four coefficients and avoid the rotation. Unfortunately, I don't think they'd necessarily translate well to the SPUs, where memory is a huge concern but computation is cheap as dirt. Since one of the greater strengths of the technique is how easily you can push the work to the SPUs (while the GPU starts on something else or takes on a piece of the work), that would seem a problematic choice there. Six of one, half a dozen of the other.

You're right about the two layers quite possibly not being enough for general transparency. Still, you could cheat that in a number of ways:
a) if you assume the particles are clustered in depth, calculate a max/min specific to the particle cluster and lerp across that;
b) if they're more distributed, cut Z into more regular slices. The truth is, you can probably get away with VERY low-res lighting for the translucency comparatively, so you can probably trade height/width for depth.

If I understood correctly, they're only rendering the VPLs into the radiance volume and not directly evaluating the direct lighting in it. Also, Crytek's solution is really tailored towards small areas (they mention in the presentation that it's really meant for smaller indoor regions); otherwise they'd certainly suffer from a similar problem, as their cube is only 32x32x32.

On to the precision question... well, 8 bpc is pretty low, and you'd probably need some kind of per-component scale factor (basically some scaling constants) to get a decent range. But, realistically, the Radiosity Normal Maps we presently use are all only 8 bpc themselves with a single scalar scale factor, so I think the inaccuracies can be survived and shouldn't be too problematic. I'm pretty sure there are talks out there on the web actually going over the best way to quantize each of the first four SH coefficients. Regardless, I suppose if you wanted to get really fancy you could build a simple scale map.

And lastly, performance. You know me... performance is always my prime concern with all this stuff. I think the real performance win is that the buffer can be quite a bit lower res than the actual frame buffer (or a comparable deferred light buffer), so despite its extra weight and the additional computation I think it trades off quite well. Compare the typical resolution of lightmaps to the actual local resolution of the frame and you'll see what I mean. It wouldn't surprise me if you could trivially drop the SH buffer to a quarter of the real frame in each dimension. Compare that with a typical Prelight Pass lighting solution, where you actually have to light at *double* res to account for MSAA... I think it's quite possible the SH solution ends up cheaper overall -- of course, there's really only one way to find out...
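As a concrete illustration of the 8-bits-plus-scale-factor idea above, here is a minimal sketch of packing one colour channel's four SH coefficients into an RGBA8 texel with a single scale constant, in the spirit of the Radiosity Normal Map scheme mentioned. The scale value, channel ordering, and the signed remap of the linear band are illustrative assumptions, not anything taken from the comments.

```cpp
// Pack/unpack one colour channel's 4-coefficient SH vector into RGBA8 with a
// single scale constant. A "scale map" would simply make `scale` spatially
// varying instead of a global constant.
#include <algorithm>
#include <cmath>
#include <cstdint>

struct SH4   { float c[4]; };          // c[0] = DC (L0), c[1..3] = linear band (L1)
struct Texel { std::uint8_t r, g, b, a; };

static std::uint8_t toByte(float v)    // clamp to [0,1] and quantise to 8 bits
{
    return static_cast<std::uint8_t>(std::lround(std::clamp(v, 0.0f, 1.0f) * 255.0f));
}

Texel packSH4(const SH4& sh, float scale)
{
    Texel t;
    t.a = toByte(sh.c[0] / scale);                  // DC term: non-negative
    t.r = toByte(sh.c[1] / scale * 0.5f + 0.5f);    // linear terms: signed, remapped
    t.g = toByte(sh.c[2] / scale * 0.5f + 0.5f);    //   from [-scale, scale] to [0, 1]
    t.b = toByte(sh.c[3] / scale * 0.5f + 0.5f);
    return t;
}

SH4 unpackSH4(const Texel& t, float scale)
{
    return SH4{ {  t.a / 255.0f * scale,
                  (t.r / 255.0f * 2.0f - 1.0f) * scale,
                  (t.g / 255.0f * 2.0f - 1.0f) * scale,
                  (t.b / 255.0f * 2.0f - 1.0f) * scale } };
}
```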
Steve (2009-08-13 08:23):
Interesting idea. BTW, what about using a cube map lookup on the light direction to construct the SH? It might be faster than either performing the rotation or constructing the basis.

In some ways this is similar to what Crytek is doing, but instead of creating a full 3D volume texture of SH, you're creating an "adaptive slice" between the frontmost and backmost lit pixels and lerping in between. I imagine if you have a big depth range, lerping between front and back may not give you the results you really want for lit translucency in the middle (I'm imagining a shooter with stuff potentially very far away, and two effects in between).

You might be able to borrow a technique from Crytek and produce glossy reflections of the translucency on your opaque stuff -- they essentially do a bounded ray march in their 3D SH volume, accumulating irradiance along a direction. You could do a single ray along a direction with your scheme and achieve an approximation of that effect.

Another question I'd have is whether 8 bits per component is going to be enough to encode the dynamic range (that probably depends on your expected scene).

I think the tradeoff here is that even with a smaller lighting buffer, lights are probably more expensive to evaluate than in traditional deferred lighting (just my gut feeling). But you're saving a lot of geometry cost thanks to a full-speed depth pass, and you get full-resolution lighting for transparency. So the fixed overhead is probably smaller.
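To make the cube-map suggestion above concrete: the first four SH basis values for a direction are cheap to evaluate, so they can either be computed per light or baked into a small RGBA cube map indexed by the light direction, turning the per-light basis construction (or rotation) into a single fetch. Below is a minimal sketch of what each cube-map texel would store; the constants are the standard band-0/band-1 normalisation factors, while the function names and the delta-light projection helper are purely illustrative.

```cpp
// First four real SH basis functions and a per-light projection using them.
// evalSH4(d) is exactly the value a cube-map texel for direction d would hold.
struct Vec3 { float x, y, z; };
struct SH4  { float c[4]; };

SH4 evalSH4(const Vec3& d)            // d must be a unit direction
{
    const float k0 = 0.282095f;       // sqrt(1 / (4*pi))
    const float k1 = 0.488603f;       // sqrt(3 / (4*pi))
    return SH4{ { k0, k1 * d.y, k1 * d.z, k1 * d.x } };
}

// Project a directional (delta) light into 4-coefficient SH for one colour
// channel -- the per-light work that a cube-map lookup would replace.
SH4 projectDirectionalLight(const Vec3& lightDir, float intensity)
{
    SH4 sh = evalSH4(lightDir);
    for (float& c : sh.c) c *= intensity;
    return sh;
}
```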