Shadow map techniques are employed in computer graphics to improve visual realism. The shadow map algorithm requires a two-pass render of a scene: once from the light’s perspective and once from the camera’s perspective. The render from the light’s perspective records the depth from the light source to each object in the scene. The depth values are written to a pbuffer and then stored in a texture, which is later passed to the fragment shader as a uniform sampler. When rendering the scene from the camera’s view, each screen pixel is transformed to light coordinates, where its z-value is compared to the value stored in the shadow map. This comparison is performed in the fragment shader using the shadow2DProj function. If the pixel’s z-value is greater than the depth value stored in the shadow map, something else is closer to the light and the pixel is in shadow. If the pixel’s z-value is less than the depth value stored in the shadow map, the pixel is not in shadow.
One inherent side effect of shadow maps is a hard, aliased edge around the shadow. Soft-edged shadows built with percentage-closer filtering and pixel shader branching are a visual enhancement to the shadow map: the technique calculates a soft penumbra region around a shadow where we previously had a hard, aliased edge, and uses optimization mechanisms to keep that calculation affordable. Below are two example scenes that illustrate the visual enhancement of soft-edged shadows.
Tparty Scene with Aliased Edge. 
Tparty Scene with Soft Edge. 
Abstract Shapes Scene with Aliased Edge. 
Abstract Shapes Scene with Soft Edge.
To create the soft edge, we employ William Reeves’s algorithm, called percentage-closer filtering. With this technique we take a large number of samples, 32 to 64, from a region of the depth map that surrounds a given screen pixel. Based on the results of the sampling, we determine an amount, or percentage, which is used to attenuate the intensity of the shadow. Performing this calculation for every pixel in the scene creates the appearance of a soft edge at the shadow’s boundary as we move from 100% in shadow to 0% in shadow. To illustrate this algorithm:
[Figure: a pixel z-value is compared against each value in a depth map region, producing a binary value per sample (1 = in shadow, 0 = not in shadow) and a percentage in shadow.]
Consider a pixel with a z-value of 35 and a 3x3 depth map region in which six depth measurements are 40 and three depth measurements are 30. At the locations where the shadow map depth is 40, our pixel is closer to the light, so those locations are not in shadow. However, the three depth measurements of 30 are closer to the light than our pixel, so these three locations are in shadow. With three locations in shadow and six not in shadow, the total shaded contribution is 3/9 = 33%.
Pseudocode:
Let numberInShadow = 0
Let numberOfSamples = 9
For each value in depth map region
    Do depth comparison with pixel z-value
    If pixel z-value < depth map value
        Then numberInShadow += 0
    Else if pixel z-value > depth map value
        Then numberInShadow += 1
Let percentInShadow = numberInShadow / numberOfSamples
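The pseudocode above can be sketched as a small CPU-side C function. This is only an illustration of the counting step, not the project's shader: the names percent_in_shadow and example_percent_in_shadow are hypothetical, and treating equal depths as lit is an assumption, since the pseudocode does not cover that case.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of depth map samples that are closer to the light than the pixel.
   A sample counts as "in shadow" when the pixel z-value is greater than the
   stored depth. Equal depths are counted as lit (an assumption). */
float percent_in_shadow(float pixel_z, const float *depth_region, size_t count)
{
    size_t number_in_shadow = 0;
    size_t i;

    assert(count > 0);
    for (i = 0; i < count; i++) {
        if (pixel_z > depth_region[i]) {
            number_in_shadow += 1;
        }
    }
    return (float)number_in_shadow / (float)count;
}

/* The 3x3 example from the text: six samples at depth 40, three at depth 30,
   and a pixel z-value of 35. */
float example_percent_in_shadow(void)
{
    const float region[9] = { 40.0f, 40.0f, 40.0f,
                              40.0f, 40.0f, 40.0f,
                              30.0f, 30.0f, 30.0f };
    return percent_in_shadow(35.0f, region, 9);
}
```

Calling example_percent_in_shadow() counts three of the nine samples as in shadow, which matches the 3/9 worked out above.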
SoftEdgedShadow.frag:
uniform sampler2DShadow texture0; // shadowMap
uniform sampler3D texture2; // jitterMap
int totalSamples = 64;
vec4 shadowMapCoord = gl_TexCoord[0];
float filterSize = .48;
vec2 scaleJitterxy = vec2(.5, .5);
vec3 jitterCoord = vec3(gl_FragCoord.xy * scaleJitterxy, 0.0);
float shadow = 0.0;
// Each jitter texel holds two offsets (xy and zw), so loop totalSamples/2 times
for (int i = 0; i < totalSamples/2; i++)
{
    // Expand the stored [0, 1] texel values to offsets in [-1, 1]
    vec4 offset = (texture3D(texture2, jitterCoord) * 2.0) - 1.0;
    jitterCoord.z += 1.0/32.0; // step to the next slice of the jitter map
    shadowMapCoord.xy = offset.xy * filterSize + gl_TexCoord[0].xy;
    shadow += shadow2DProj(texture0, shadowMapCoord).r; // shadow2DProj returns a vec4
    shadowMapCoord.xy = offset.zw * filterSize + gl_TexCoord[0].xy;
    shadow += shadow2DProj(texture0, shadowMapCoord).r;
}
shadow /= float(totalSamples); // Percentage in shadow
According to Yury Uralsky’s contribution in the book GPU Gems 2, sampling a depth map on a regular grid like this produces a banding effect at the shadow edges. This is an undesirable effect and can be corrected by offsetting the sample locations with controlled randomness, or jittering. The idea is that we divide the sample region into subregions and randomly sample one location within each subregion. The depth value recorded at that location is then compared to the pixel z-value. Employing jittering results in a smooth, soft penumbra in which the error appears as noise rather than banding.
[Figure: Regular Sample Box and Jittered Sample Box.]
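Pseudocode:
// Jitter the sample box
// Assuming ran1() returns a value between 0 and 1, (ran1()*2) - 1 lies
// between -1 and 1, so each edge moves by at most half a cell
box[0] += ((float)(ran1()*2) - 1) * (0.5f/uSamp);
box[1] += ((float)(ran1()*2) - 1) * (0.5f/vSamp);
box[2] += ((float)(ran1()*2) - 1) * (0.5f/uSamp);
box[3] += ((float)(ran1()*2) - 1) * (0.5f/vSamp);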
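As a rough illustration of the jitter step, the C sketch below moves one coordinate of a sample away from the centre of its grid cell by at most half a cell. The function name jitter_coordinate is hypothetical, the random number is passed in by the caller rather than drawn from ran1(), and starting from the cell centre is an assumption about how the sample box is laid out.

```c
#include <assert.h>

/* Jitter one coordinate of a sample inside its grid cell.
   cell:    index of the cell along this axis, 0 .. samples-1
   samples: number of cells along this axis (uSamp or vSamp in the text)
   r:       caller-supplied random number in [0, 1), standing in for ran1()
   Returns a position in [0, 1) that stays inside the chosen cell. */
float jitter_coordinate(int cell, int samples, float r)
{
    float centre;

    assert(samples > 0 && cell >= 0 && cell < samples);
    /* Centre of the cell, with the whole sample box spanning [0, 1] */
    centre = ((float)cell + 0.5f) / (float)samples;
    /* Shift by up to half a cell width in either direction */
    return centre + (r * 2.0f - 1.0f) * (0.5f / (float)samples);
}
```

With r = 0.5 the sample stays at the cell centre, and with r = 0 it lands on the lower edge of the cell, so every subregion still contributes exactly one sample.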
Implementing this algorithm directly would require 32 to 64 offset calculations for each set of z-value comparisons for a single pixel. To save computation cycles, the offsets are precalculated and stored in a 3D texture map.
glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA, size, size, uSamp * vSamp / 2, 0, GL_RGBA, GL_BYTE, img);
The tradeoff here is that we save on computation time but consume more memory. However, we reduce the memory required for this 3D jitter map by storing a pair of offsets in each texel. Offsets occur in the x and y directions, so we store one set of x and y offset values in the r and g components of the texel, and a second set of x and y offset values in the b and a components. Storing offset data in this way cuts the required texture map size in half.
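The packing described above can be sketched in C as follows. This is a minimal illustration, not the project's code: pack_offset_pairs is a hypothetical name, the offsets are kept as floats, and the conversion to the byte format passed to glTexImage3D is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Pack consecutive pairs of 2D offsets into RGBA texels:
   r,g hold the first offset (x, y) and b,a hold the second offset (x, y).
   xs and ys each hold offset_count values; rgba must have room for
   4 * (offset_count / 2) floats. Returns the number of texels written. */
size_t pack_offset_pairs(const float *xs, const float *ys,
                         size_t offset_count, float *rgba)
{
    size_t texels = offset_count / 2;
    size_t t;

    assert(offset_count % 2 == 0);
    for (t = 0; t < texels; t++) {
        rgba[4 * t + 0] = xs[2 * t];     /* r: first offset, x  */
        rgba[4 * t + 1] = ys[2 * t];     /* g: first offset, y  */
        rgba[4 * t + 2] = xs[2 * t + 1]; /* b: second offset, x */
        rgba[4 * t + 3] = ys[2 * t + 1]; /* a: second offset, y */
    }
    return texels;
}
```

For 64 offsets this writes 32 texels, which is consistent with the uSamp * vSamp / 2 depth in the glTexImage3D call and with the 1.0/32.0 step used to walk the jitter map in the shader.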
Performing 32 to 64 samples for each screen pixel still requires an extensive amount of processing in the fragment shader. We use another optimization technique, called pixel shader branching, to improve the efficiency of the percentage-closer filtering algorithm. The key idea is that most pixels in the scene are either entirely in shadow or entirely out of shadow; only a relatively small number of pixels actually reside in the penumbra region of the shadow. So, rather than taking 32 to 64 samples blindly, we first take a small number of samples, 8 to 16. If all 8 or 16 of these depth test samples return in shadow, or all return out of shadow (all 0’s or all 1’s), then we assume that the pixel does not reside in the penumbra region of the shadow and we do not take all 32 or 64 samples. On the other hand, if some of these 8 or 16 depth test samples return in shadow and some return not in shadow (some 0’s and some 1’s), then we assume that the pixel does lie in the penumbra, so we go on to take the full 32 to 64 samples. This algorithm has been implemented in the project’s fragment program in a manner very similar to that shown in the book GPU Gems 2, page 279. The pseudocode for this algorithm is as follows:
Do 4 to 8 samples
    Lookup offset
    Get shadow map coordinates with offset.xy values
    Do depth comparison and store result
    Get shadow map coordinates with offset.zw values
    Do depth comparison and store result
If stored result not equal to 0 or 8 then
    Do 32 to 64 samples
        Lookup offset
        Get shadow map coordinates with offset.xy values
        Do depth comparison and store result
        Get shadow map coordinates with offset.zw values
        Do depth comparison and store result
Else
    Not in the penumbra, no further samples required
Add shadow percentage to final fragment color
SoftEdgedShadowPSB.frag:
int totalSamples = 64;
int testSamples = 8;
vec4 shadowMapCoord = gl_TexCoord[0];
float filterSize = .48;
vec2 scaleJitterxy = vec2(.5, .5);
vec3 jitterCoord = vec3(gl_FragCoord.xy * scaleJitterxy, 0.0);
float shadow = 0.0;
// Cheap test pass: testSamples comparisons, two per jitter texel
for(int i = 0; i < testSamples/2; i++)
{
    vec4 offset = (2.0 * texture3D(texture2, jitterCoord)) - 1.0;
    jitterCoord.z += 1.0/32.0;
    shadowMapCoord.xy = offset.xy * filterSize + gl_TexCoord[0].xy;
    shadow += shadow2DProj(texture0, shadowMapCoord).r;
    shadowMapCoord.xy = offset.zw * filterSize + gl_TexCoord[0].xy;
    shadow += shadow2DProj(texture0, shadowMapCoord).r;
}
shadow /= float(testSamples);
// Nonzero only when the test samples disagree (0 < shadow < 1): likely penumbra
if((shadow - 1.0) * shadow != 0.0)
{
    // Turn the average back into a running sum that includes the test samples
    shadow *= float(testSamples);
    // Take the remaining samples so that totalSamples are used in all
    for(int i = 0; i < (totalSamples - testSamples)/2; i++)
    {
        vec4 offset = (2.0 * texture3D(texture2, jitterCoord)) - 1.0;
        jitterCoord.z += 1.0/32.0;
        shadowMapCoord.xy = offset.xy * filterSize + gl_TexCoord[0].xy;
        shadow += shadow2DProj(texture0, shadowMapCoord).r;
        shadowMapCoord.xy = offset.zw * filterSize + gl_TexCoord[0].xy;
        shadow += shadow2DProj(texture0, shadowMapCoord).r;
    }
    shadow /= float(totalSamples);
}
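To make the early-out logic easier to check outside a shader, here is a small CPU-side sketch in C. It is an illustration under stated assumptions rather than the project's implementation: branched_shadow is a hypothetical name, and the results array stands in for the 0 or 1 values that the depth comparisons would return.

```c
#include <assert.h>
#include <stddef.h>

/* Average a set of depth comparison results with an early out.
   results:       outcome of each depth comparison, 0.0f or 1.0f
   test_samples:  how many results to inspect first (8 to 16 in the text)
   total_samples: how many results to use when the pixel is in the penumbra
   samples_taken: receives the number of results actually read */
float branched_shadow(const float *results, size_t test_samples,
                      size_t total_samples, size_t *samples_taken)
{
    float sum = 0.0f;
    size_t i;

    assert(test_samples > 0 && test_samples <= total_samples);
    for (i = 0; i < test_samples; i++) {
        sum += results[i];
    }
    /* All test samples agree: assume the pixel is outside the penumbra. */
    if (sum == 0.0f || sum == (float)test_samples) {
        *samples_taken = test_samples;
        return sum / (float)test_samples;
    }
    /* Mixed results: keep the running sum and read the remaining samples. */
    for (; i < total_samples; i++) {
        sum += results[i];
    }
    *samples_taken = total_samples;
    return sum / (float)total_samples;
}
```

When the first 8 results agree, only 8 of the 64 comparisons are read; when they disagree, all 64 are averaged, with the test samples counted once.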
Performance of pixel shader branching seems dependent on the complexity of the geometry and the amount of penumbra within a given scene. Rendering a scene once with pixel shader branching implemented and once without it yielded the following results:
Draw Times: PSB Not Implemented: 0.098 PSB Implemented: 0.065 

Draw Times: PSB Not Implemented: 0.184 PSB Implemented: 0.061 