Mathematics of the depth metric when generating shadow maps and rendering with shadows
This article has been written to keep track of how the depth metric is computed when generating shadow maps and rendering shadows in Babylon.js, as understanding how we ended up using the formula described below can be hard at times, especially when you throw into the mix the support of the reverse depth buffer and the reduced NDC Z range that WebGPU is using...
Generating the shadow map
Shadow maps are generated by the `ShadowGenerator` class for standard shadows, and by the `CascadedShadowGenerator` class for cascaded shadow maps.
The idea is to generate a texture that contains the depth of the geometry closest to the light when the scene is rendered from the light's point of view. Basically, this depth is the z coordinate of the 3D point once transformed into the view space of the light (in this space, the Z axis points forward). So, given two points A and B, if `zA < zB` in this space, A is closer to the light than B and it is `zA` that is written to the shadow map.
Babylon.js is not doing anything fancy here and is simply using the transformation matrix (view x projection) of the light to render the shadow casters and generate the shadow map. However, there are several cases to consider.
PCF and PCSS filtering
When using PCF (Percentage Closer Filtering) and PCSS (Percentage Closer Soft Shadows) to render actual shadows, we are using as our shadow map the depth texture generated by the GPU when rendering the shadow casters. So, this texture is automatically generated as part of the rendering and we have nothing specific to do, except applying the bias value defined in the shadow generator. The shader code looks like this (in the shadowMapVertexMetric.fx file):

    #if SM_DEPTHTEXTURE == 1
        #ifdef IS_NDC_HALF_ZRANGE
            #define BIASFACTOR 0.5
        #else
            #define BIASFACTOR 1.0
        #endif

        #if SM_USE_REVERSE_DEPTHBUFFER == 1
            gl_Position.z -= biasAndScaleSM.x * gl_Position.w * BIASFACTOR;
        #else
            gl_Position.z += biasAndScaleSM.x * gl_Position.w * BIASFACTOR;
        #endif
    #endif

`SM_DEPTHTEXTURE` is set to 1 only when using PCF/PCSS filtering. `biasAndScaleSM.x` is the bias value (note that the normal bias is applied earlier and modifies the world position of the 3D point).
We multiply by `gl_Position.w` because the GPU, as part of its computations, will do `gl_Position.z / gl_Position.w` before writing the value to the depth texture: by pre-multiplying by `gl_Position.w`, we make sure the final result is simply biased by a constant `biasAndScaleSM.x * BIASFACTOR` value.
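To make the cancellation explicit, here is the perspective divide written out for the standard (non-reversed) case. The symbols $b$ (for `biasAndScaleSM.x`) and $k$ (for `BIASFACTOR`) are shorthand introduced here for illustration:

$$
\frac{z_{clip} + b \, k \, w_{clip}}{w_{clip}} = \frac{z_{clip}}{w_{clip}} + b \, k
$$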
When the NDC space has a `0..1` Z range (meaning `IS_NDC_HALF_ZRANGE` is defined), we use a bias factor of 0.5 so that the final bias applied to the position has the same scale as when the range is `-1..1`.
Note that in the standard case (when not using the reverse depth buffer), we add the bias to the position, which pushes the depth value (that is, the geometry) slightly farther from the light. Another strategy would be to apply the bias not in the shadow map but at the shadow rendering stage. In that case, we would subtract the bias from the current depth of the pixel (as seen from the light) to achieve the same result.
Finally, when using the reverse depth buffer, we simply negate the bias offset, because larger z values now mean nearer geometry.
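As a rough CPU-side illustration of the shader snippet above, the following sketch applies the same bias and then performs the divide by w that the GPU would do. The function name `biasedDepth` and its parameters are hypothetical and are not part of the Babylon.js API.

```typescript
// Hypothetical helper mirroring the shader logic; not a Babylon.js function.
function biasedDepth(
    clipZ: number,          // gl_Position.z before the bias is applied
    clipW: number,          // gl_Position.w
    bias: number,           // biasAndScaleSM.x
    ndcHalfZRange: boolean, // true when IS_NDC_HALF_ZRANGE would be defined
    reverseDepth: boolean   // true when SM_USE_REVERSE_DEPTHBUFFER == 1
): number {
    const biasFactor = ndcHalfZRange ? 0.5 : 1.0;
    const offset = bias * clipW * biasFactor;
    // Standard case: push the depth away from the light. Reverse case: negate the offset.
    const biasedClipZ = reverseDepth ? clipZ - offset : clipZ + offset;
    // Divide by w, as the GPU does before writing to the depth texture.
    return biasedClipZ / clipW;
}

// The result differs from clipZ / clipW by a constant (bias * biasFactor), whatever clipW is.
const unbiasedSample = 2 / 4;
const biasedSample = biasedDepth(2, 4, 0.01, false, false);
const biasedSampleReversedHalfRange = biasedDepth(2, 4, 0.01, true, true);
```

With these sample values, the standard case ends up 0.01 above the unbiased depth, and the reversed half-range case 0.005 below it.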
Generating a depth metric
When not using PCF / PCSS modes (actually, we also need the depth metric described here in PCSS mode), we need to generate a depth metric, which is the depth value we will use when doing depth comparisons to compute the shadow level of a given pixel.
The computation we are doing to generate this value is (in the `shadowMapVertexMetric.fx` file):

    #if SM_USE_REVERSE_DEPTHBUFFER == 1
        vDepthMetricSM = (-gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
    #else
        vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
    #endif

The aim is to generate a normalized value in the `0..1` range from the `gl_Position.z` value, which is the z component of the 3D vertex after the transformation matrix (view x projection) has been applied.
In the next sections, we are going to explain which values to set in `depthValuesSM.x` and `depthValuesSM.y` to achieve this goal.
As a preamble, we will only focus on the projection matrix because we are only interested in how the projection remaps the z values to the NDC space and the view matrix does not come into play in this computation.
Notes:
- we could have used the depth texture described in the previous section in all cases to retrieve the depth values we need and avoid having to deal with this depth metric, but for historical reasons and because WebGL1 does not support depth textures, we need this depth metric.
- this section assumes the NDC Z range is `-1..1`. We will handle the `0..1` range later.
- the reverse depth buffer case is handled simply by swapping the near and far planes of the light in the projection matrix
- the projection matrices we are dealing with are for a left-handed coordinate system, but the results are the same for a right-handed system
- the depth renderer is also using the same computation to generate the depth texture, so what we are describing below for the spotlight (perspective projection) is applicable to the depth renderer (the light being replaced by the camera).
Directional light
Directional lights are using an orthographic projection to transform points to NDC space (clip space to be precise). This projection is:
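Assuming the usual left-handed orthographic matrix in the row-vector convention (the point is multiplied on the left of the matrix), and leaving the x/y terms as the symbols $a$, $b$, $i_0$ and $i_1$, it can be written as:

$$
\begin{pmatrix} x & y & z & 1 \end{pmatrix}
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & \dfrac{2}{f-n} & 0 \\
i_0 & i_1 & -\dfrac{f+n}{f-n} & 1
\end{pmatrix}
$$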
`n` and `f` are the near and far planes of the light respectively (`light.shadowMinZ` / `light.shadowMaxZ` if defined, `camera.minZ` / `camera.maxZ` if not). Note that we are only interested in the transformation of the z coordinate, so we don't need the `a`, `b`, `i0` and `i1` values:
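Assuming the usual left-handed orthographic projection with an NDC Z range of -1..1, the z coordinate is transformed as:

$$
z_{ortho} = \frac{2z}{f-n} - \frac{f+n}{f-n} = \frac{2z - f - n}{f-n}
$$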
It's a linear function of z (which is something we want), but the range is not `0..1` when z takes values between `n` and `f`:
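Evaluating $z_{ortho} = \frac{2z - f - n}{f-n}$ (the z mapping of a left-handed orthographic projection to a -1..1 range, which is assumed here) at both planes gives:

$$
z = n: \quad z_{ortho} = \frac{2n - f - n}{f-n} = \frac{n - f}{f - n} = -1
\qquad
z = f: \quad z_{ortho} = \frac{2f - f - n}{f-n} = \frac{f - n}{f - n} = 1
$$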
So the range is `-1..1`. To remap this range to `0..1`, we can simply add 1 to z and divide everything by 2.
Looking back at how `vDepthMetricSM` is defined:
vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
(don't forget that `z_ortho = gl_Position.z`)
We simply need to have `depthValuesSM.x = 1` and `depthValuesSM.y = 2`.
In the JavaScript code, the `depthValuesSM` shader variable is set like this:
effect.setFloat2("depthValuesSM", this.getLight().getDepthMinZ(scene.activeCamera), this.getLight().getDepthMinZ(scene.activeCamera) + this.getLight().getDepthMaxZ(scene.activeCamera));
So:

    depthValuesSM.x = this.getLight().getDepthMinZ(scene.activeCamera);
    depthValuesSM.y = this.getLight().getDepthMinZ(scene.activeCamera) + this.getLight().getDepthMaxZ(scene.activeCamera);

Which means that for directional lights, `getDepthMinZ` must return `1` and `getDepthMaxZ` must also return `1`.
In the reverse depth buffer case:
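Swapping $n$ and $f$ in the orthographic z mapping (assuming the left-handed form $z_{ortho} = \frac{2z - f - n}{f-n}$ for a -1..1 NDC Z range) gives:

$$
z_{ortho} = \frac{2z - n - f}{n - f}
\qquad
z = n: \quad z_{ortho} = \frac{n - f}{n - f} = 1
\qquad
z = f: \quad z_{ortho} = \frac{f - n}{n - f} = -1
$$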
This time the range is `1..-1`. However, in the shader, for the reverse depth buffer case we have:
vDepthMetricSM = (-gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
which means `z_ortho` is multiplied by `-1` before being added to `depthValuesSM.x`. So `1..-1` becomes `-1..1` and we are back to the same case as before, so we need the same values in `depthValuesSM.x` and `depthValuesSM.y` (that is, 1 and 2 respectively).
Spot light
Spot lights are using a perspective projection to transform points to NDC space (clip space to be precise). This projection is:
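Assuming the usual left-handed perspective matrix in the row-vector convention for an NDC Z range of -1..1 (with $a$ and $b$ standing for the x/y scale factors, which do not matter here), it can be written as:

$$
\begin{pmatrix} x & y & z & 1 \end{pmatrix}
\begin{pmatrix}
a & 0 & 0 & 0 \\
0 & b & 0 & 0 \\
0 & 0 & \dfrac{f+n}{f-n} & 1 \\
0 & 0 & -\dfrac{2fn}{f-n} & 0
\end{pmatrix}
$$

Keeping only the z component, and remembering that `gl_Position.z` is the clip-space value before the divide by $w = z$:

$$
z_{persp} = \frac{z\,(f+n) - 2fn}{f-n}
$$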
Regarding the range when z takes values between `n` and `f`:
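Assuming the left-handed perspective projection with an NDC Z range of -1..1, for which the clip-space z is $z_{persp} = \frac{z\,(f+n) - 2fn}{f-n}$, evaluating at both planes gives:

$$
z = n: \quad z_{persp} = \frac{n^2 - fn}{f-n} = \frac{n\,(n - f)}{f - n} = -n
\qquad
z = f: \quad z_{persp} = \frac{f^2 - fn}{f-n} = \frac{f\,(f - n)}{f - n} = f
$$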
The range is `-n..f`, which means that for spot lights we need `getDepthMinZ` to return `n` and `getDepthMaxZ` to return `f` to remap this range to `0..1` once we apply the computation (recall that `depthValuesSM.x = light.getDepthMinZ()` and `depthValuesSM.y = light.getDepthMinZ() + light.getDepthMaxZ()`):
vDepthMetricSM = (gl_Position.z + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
As in the directional light case, the reverse depth buffer range is negated compared to the normal case, but because of the minus sign in front of `gl_Position.z` in the `vDepthMetricSM` formula, `getDepthMinZ` and `getDepthMaxZ` must return the same values.
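The directional and spot light results for the -1..1 NDC Z range can be checked numerically with a small sketch. The helper names below (`orthoClipZ`, `perspClipZ`, `depthMetric`) are hypothetical, the projections are assumed to be the usual left-handed ones, and the bias is left out for clarity.

```typescript
// Clip-space z of a left-handed orthographic projection (NDC Z range -1..1).
function orthoClipZ(z: number, n: number, f: number): number {
    return (2 * z - f - n) / (f - n);
}

// Clip-space z (before the divide by w) of a left-handed perspective projection (NDC Z range -1..1).
function perspClipZ(z: number, n: number, f: number): number {
    return (z * (f + n) - 2 * f * n) / (f - n);
}

// CPU mirror of the vDepthMetricSM formula, without the bias term.
function depthMetric(clipZ: number, minZ: number, maxZ: number, reverseDepth: boolean): number {
    const x = minZ;        // depthValuesSM.x
    const y = minZ + maxZ; // depthValuesSM.y
    return ((reverseDepth ? -clipZ : clipZ) + x) / y;
}

// Arbitrary sample planes and three view-space depths: near, middle, far.
const near = 1;
const far = 101;
const viewZ = [near, (near + far) / 2, far];

// Directional light: minZ = 1, maxZ = 1. The reverse case swaps near and far in the projection.
const directional = viewZ.map((z) => depthMetric(orthoClipZ(z, near, far), 1, 1, false));
const directionalReversed = viewZ.map((z) => depthMetric(orthoClipZ(z, far, near), 1, 1, true));

// Spot light: minZ = n, maxZ = f.
const spot = viewZ.map((z) => depthMetric(perspClipZ(z, near, far), near, far, false));
const spotReversed = viewZ.map((z) => depthMetric(perspClipZ(z, far, near), near, far, true));
```

With these sample planes, all four arrays come out as 0, 0.5 and 1 for the near, middle and far depths.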
Point light
Point lights use shadow maps that store the distance from the geometry to the light. This distance is computed as `length(position - lightPosition)`, which is then remapped to the `0..1` range (in the `shadowMapFragment.fx` file):
depthSM = (length(vPositionWSM - lightDataSM) + depthValuesSM.x) / depthValuesSM.y + biasAndScaleSM.x;
It's the same computation as previously described, except that we are using the distance to the light instead of the depth. `vPositionWSM` is the world position of the point and `lightDataSM` the world position of the light. There's no specific case for the reverse depth buffer mode as it is irrelevant: we are computing a distance, not a depth.
Note that in reality this formula does not remap to `0..1`, because `length(vPositionWSM - lightDataSM)` has no upper bound and can go to +infinity: `vPositionWSM` is constrained to be in the view frustum, but the light can be positioned anywhere in the world. So, to simplify things, we set up the `getDepthMinZ` and `getDepthMaxZ` functions to return the same values as in the spot light case, meaning `n` and `f` respectively. It's not really important that we are not remapping strictly to `0..1` as long as we use the same computation when rendering shadows, so that both values can be compared.
Notes:
- Even if `length(position - lightPosition)` can go to +infinity in theory, the point light is generally not too far from the geometry currently in the view frustum: a light that is too far away would contribute very little (or nothing) and would not cast shadows anyway (every point light has a maximum distance beyond which its intensity falls to 0)
- we could remove the remapping altogether and simply use `length(position - lightPosition)` as the depth metric, but that would require using a float texture in all cases. In WebGL1 mode, if the float texture extension is not supported, we use an 8-bit UNORM texture, so we need a `0..1` remapping
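The point light computation can be sketched on the CPU as follows. The names `Vec3`, `distanceBetween` and `pointLightDepthMetric` are hypothetical and only mirror the shader formula, with `n` and `f` standing for the values returned by `getDepthMinZ` and `getDepthMaxZ`.

```typescript
type Vec3 = [number, number, number];

// Euclidean distance between two points, the equivalent of length(a - b) in GLSL.
function distanceBetween(a: Vec3, b: Vec3): number {
    const dx = a[0] - b[0];
    const dy = a[1] - b[1];
    const dz = a[2] - b[2];
    return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

// Mirror of the depthSM formula: depthValuesSM.x = n, depthValuesSM.y = n + f.
function pointLightDepthMetric(positionW: Vec3, lightPosition: Vec3, n: number, f: number, bias: number): number {
    return (distanceBetween(positionW, lightPosition) + n) / (n + f) + bias;
}

// A point at distance 5 from the light, with n = 1 and f = 9: (5 + 1) / (1 + 9) = 0.6.
const nearbyMetric = pointLightDepthMetric([3, 4, 0], [0, 0, 0], 1, 9, 0);

// A point far beyond f: the metric exceeds 1, which is why the remapping is not strictly 0..1.
const distantMetric = pointLightDepthMetric([0, 0, 100], [0, 0, 0], 1, 9, 0);
```

The second sample illustrates the caveat above: nothing clamps the value, so the same formula has to be used on both sides of the comparison.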
Generating a depth metric (NDC `0..1` Z range)
When using an NDC space where the z coordinate is in the `0..1` range, the orthographic and perspective projection matrices change. Let's see how this affects the results from the previous section.
Directional light
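Assuming the usual left-handed orthographic projection for a 0..1 NDC Z range, the z mapping and its reverse depth buffer counterpart (near and far planes swapped) are:

$$
z_{ortho} = \frac{z - n}{f - n}
\qquad
z = n: \; 0
\qquad
z = f: \; 1
$$

$$
z_{ortho}^{reverse} = \frac{z - f}{n - f}
\qquad
z = n: \; 1
\qquad
z = f: \; 0
$$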
We can see that in the non reverse depth buffer case the remapping is already `0..1`, so `getDepthMinZ` should return 0 and `getDepthMaxZ` should return 1.
In the reverse depth buffer case, as z is negated in the `vDepthMetricSM` formula, the negated `z_ortho` range is `-1..0`. We need to add 1 to remap to `0..1`. To do that, we can simply have `getDepthMinZ` return 1 and `getDepthMaxZ` return 0.
Spot light
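Assuming the usual left-handed perspective projection for a 0..1 NDC Z range, the clip-space z (before the divide by $w$) and its reverse depth buffer counterpart (near and far planes swapped) are:

$$
z_{persp} = \frac{f\,(z - n)}{f - n}
\qquad
z = n: \; 0
\qquad
z = f: \; f
$$

$$
z_{persp}^{reverse} = \frac{n\,(z - f)}{n - f}
\qquad
z = n: \; n
\qquad
z = f: \; 0
$$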
In the non reverse depth buffer case, we need to remap `0..f` to `0..1`: we need to divide by `f`. To do that, `getDepthMinZ` should return 0 and `getDepthMaxZ` should return `f`.
In the reverse depth buffer case, we need to remap `-n..0` to `0..1` (don't forget that when using the reverse depth buffer we have `-gl_Position.z` in the `vDepthMetricSM` formula, not `gl_Position.z`): we need to add `n` and divide by `n`. To do that, `getDepthMinZ` should return `n` and `getDepthMaxZ` should return 0.
Point light
Point lights are no different from the NDC `-1..1` range case: because we are exclusively dealing with distances, the NDC z range is irrelevant.
Shadow rendering
There's not much to say regarding the shadow rendering part: we simply have to make sure we use the exact same formula to compute the depth metric of the current pixel as the one used to generate the shadow maps. The shader code used to compute the `vDepthMetric` value in this case is (in the `shadowsVertex.fx` file):

    #if USE_REVERSE_DEPTHBUFFER
        vDepthMetric{X} = (-vPositionFromLight{X}.z + light{X}.depthValues.x) / light{X}.depthValues.y;
    #else
        vDepthMetric{X} = (vPositionFromLight{X}.z + light{X}.depthValues.x) / light{X}.depthValues.y;
    #endif

So, we must pass in `light{X}.depthValues.x` and `light{X}.depthValues.y` the same values that we passed in the `depthValuesSM.x` and `depthValuesSM.y` parameters when generating the shadow maps.
To sum up
First recall that:

    depthValues.x = light.getDepthMinZ(camera);
    depthValues.y = light.getDepthMinZ(camera) + light.getDepthMaxZ(camera);

and that `n` is the near plane distance and `f` the far plane distance (`light.shadowMinZ` / `light.shadowMaxZ` if defined, `camera.minZ` / `camera.maxZ` otherwise):
| | Directional light | Spot light | Point light |
|---|---|---|---|
| NDC -1..1 | minZ=1, maxZ=1 | minZ=n, maxZ=f | minZ=n, maxZ=f |
| NDC -1..1 + reverse depth buffer | minZ=1, maxZ=1 | minZ=n, maxZ=f | minZ=n, maxZ=f |
| NDC 0..1 | minZ=0, maxZ=1 | minZ=0, maxZ=f | minZ=n, maxZ=f |
| NDC 0..1 + reverse depth buffer | minZ=1, maxZ=0 | minZ=n, maxZ=0 | minZ=n, maxZ=f |
In this table, `minZ` stands for `getDepthMinZ` and `maxZ` for `getDepthMaxZ`.
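The table can also be read as a small lookup function. The sketch below is only an illustration of the table: `LightKind`, `DepthRange`, `depthRange` and `depthValues` are hypothetical names, and the real values come from the `getDepthMinZ` and `getDepthMaxZ` methods of each light.

```typescript
type LightKind = "directional" | "spot" | "point";

interface DepthRange {
    minZ: number; // what getDepthMinZ is expected to return
    maxZ: number; // what getDepthMaxZ is expected to return
}

// Transcription of the summary table; n and f are the near and far planes of the light.
function depthRange(kind: LightKind, n: number, f: number, ndcHalfZRange: boolean, reverseDepth: boolean): DepthRange {
    if (kind === "point") {
        // Distances only: neither the NDC range nor the reverse depth buffer matters.
        return { minZ: n, maxZ: f };
    }
    if (!ndcHalfZRange) {
        // NDC -1..1: same values with or without the reverse depth buffer.
        return kind === "directional" ? { minZ: 1, maxZ: 1 } : { minZ: n, maxZ: f };
    }
    if (kind === "directional") {
        return reverseDepth ? { minZ: 1, maxZ: 0 } : { minZ: 0, maxZ: 1 };
    }
    return reverseDepth ? { minZ: n, maxZ: 0 } : { minZ: 0, maxZ: f };
}

// depthValues.x = minZ, depthValues.y = minZ + maxZ.
function depthValues(range: DepthRange): [number, number] {
    return [range.minZ, range.minZ + range.maxZ];
}

// Example: spot light, NDC 0..1, reverse depth buffer, n = 1, f = 100.
const spotReverseHalfRange = depthValues(depthRange("spot", 1, 100, true, true));
```

For the example at the end, both components equal `n`, which matches the "add `n` and divide by `n`" remapping of the spot light in the reverse depth buffer case.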