Comments on Advances in Real Time Rendering 2011

I finally got around to writing some comments on this year's Advances in Real Time Rendering course held at SIGGRAPH 2011. Thanks to the RTR team for making the notes available. The talk about physically-based shading in Call of Duty has already been mentioned in my previous post. So, in no particular order:

Rendering in Cars 2
Christopher Hall, Robert Hall, David Edwards (AVALANCHE Software)

At one point, the talk about rendering in Cars 2 describes how they use pre-exposed colors as shader inputs to avoid precision issues when doing the exposure after the image has been rendered. I have employed pre-exposed colors with dynamic exposure in the past, and I found them tricky to use. Since there is a delay in the exposure feedback (you must know the exposure of the previous frame to feed the colors for the next frame), you can even get exposure oscillation!
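To make the feedback problem concrete, here is a toy simulation of one-frame-delayed auto-exposure. The controller, its gain, and the update rule are all my own invented stand-ins, not anything from the Cars 2 talk; the point is only that an over-aggressive correction applied one frame late can oscillate instead of converging.

```python
# Toy model of one-frame-delayed auto-exposure (hypothetical controller):
# the exposure applied to frame N is derived from the luminance measured
# in frame N-1. With an aggressive gain the loop overshoots and flips
# between two exposure values instead of settling.
def simulate(scene_luminance, target=0.5, gain=2.0, frames=8):
    exposure = 1.0
    history = []
    for _ in range(frames):
        # colors are pre-exposed with LAST frame's exposure value
        measured = scene_luminance * exposure
        history.append(exposure)
        # controller corrects toward the target; gain > 1 over-corrects
        exposure *= (target / measured) ** gain
    return history

history = simulate(2.0)  # scene twice as bright as the target
```

With `gain=2.0` the exposure alternates between two values every frame; with `gain=1.0` it settles after one step, which is why damping the correction matters.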

Two uses of Voxels in LittleBigPlanet 2's graphics engine
Alex Evans and Anton Kirczenow (MediaMolecule)

Lots of intelligent ideas. Unfortunately, most of them only really work for the special case of the 2.5-dimensional world that LBP plays in. The best takeaway for generalization, I think, is the dynamic AO.

Making Game Worlds from Polygon Soup
Hao Chen, Ari Silvennoinen, Natalya Tatarchuk (Bungie)

This talk is about the spatial database organization for Halo: Reach, and the structure that they settled on is the polygon soup. All I can say is +1! An unstructured polygon soup/object soup with a spatial index on top of it, for example a loose octree [1], connected to a portal system is imho the way to go to organize 3D data for a dynamic world. I did exactly this in the past and it served me well [3]. Dynamic occlusion culling in the style of Greene/Zhang [2,6] can be added on top of it very easily, and in effect, this is what the Umbra middleware provides with software rendering. I like the idea of the automatic portalization that is presented in the talk. Manual portalization is indeed cumbersome and error-prone for the artists, as they correctly state with their outdoor examples. Their solution is similar to the way navigation meshes are built: via a flood fill of a uniform grid and cell aggregation.
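The loose-octree part of this is simple enough to sketch. The following is my own minimal cell-selection logic in the spirit of Ulrich's notes [1], not Bungie's system: with a looseness factor of 2, an object fits entirely in the loose bounds of any node whose strict edge length is at least the object's diameter, so level and cell follow directly from the bounding radius and center.

```python
# Loose-octree cell selection (sketch): pick the deepest level whose
# strict cell edge still covers the object's diameter, then index the
# cell purely by the object's center. No tree walk or refitting needed.
def choose_cell(world_size, center, radius):
    if radius <= 0:
        raise ValueError("radius must be positive")
    level = 0
    edge = world_size
    # descend while the child cell's edge still covers the diameter
    while edge * 0.5 >= 2.0 * radius:
        edge *= 0.5
        level += 1
    # integer cell coordinates at that level, from the object's center
    coords = tuple(int(c // edge) for c in center)
    return level, coords
```

Because placement depends only on center and radius, moving objects can be re-bucketed in O(1), which is what makes this attractive for a dynamic world.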

Secrets of CryENGINE 3 Graphics Technology
Tiago Sousa, Nickolay Kasyan, and Nicolas Schulz (Crytek)

A nice summary of all the little details that, when combined, make a great renderer. Interestingly, this is another case for dynamic occlusion culling via readback of the z-buffer [4]. Do not pass go, do not use occlusion queries! Occlusion queries are a misnomer, because they're the least usable for what they were invented for. When I did this, the latency incurred by the z-buffer readback was unnoticeable on a GeForce2. I did a simple glReadPixels(GL_DEPTH_COMPONENT) over a small viewport of the framebuffer where the occlusion geometry was rendered, and it did not slow things down. Today you would have at least some latency, but once the z information is available on the CPU, it can be used for nice things like determining the number of shadow-map cascades needed to cover the visible depth range.
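As an illustration of that last point, here is one possible policy for picking the cascade count from a read-back depth range. The policy (a fixed depth ratio per cascade) and all parameter values are my own assumptions for the sketch, not anything from the CryENGINE talk.

```python
import math

# Sketch: given the min/max depth of the visible scene (e.g. from a
# small z-buffer readback), choose how many shadow-map cascades to
# render. Hypothetical policy: each cascade covers a fixed ratio of
# depth, so the count is the log of the far/near ratio in that base.
def cascades_needed(z_min, z_max, ratio_per_cascade=8.0, max_cascades=4):
    if z_max <= z_min:
        return 1
    n = math.ceil(math.log(z_max / z_min, ratio_per_cascade))
    return max(1, min(max_cascades, n))
```

A shallow visible range (everything near the camera, or a wall filling the view) then skips the far cascades entirely instead of rendering them for nothing.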
Another thing of note is how the authors describe a technique for blurring alpha-tested geometry for hair. It looks very nice but also seems to be a bit expensive. For Velvet Assassin, we used something that I named 'dual-blend': use alpha test for the solid parts of the texture, and in a second pass, alpha-blend (with an inverted alpha test) just the pixels that have an intermediate alpha value, while the artists made sure the ordering of the triangles is from inner to outer.
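The pass split can be stated as a simple predicate per texel. The threshold value of 0.5 below is an assumption for the sketch; the point is that the two tests partition the coverage range, so every texel with nonzero alpha is drawn by exactly one pass.

```python
# 'Dual-blend' pass classification (sketch, assumed threshold 0.5):
# pass 1 uses a plain alpha test and renders opaque with depth write;
# pass 2 alpha-blends only the texels the inverted test lets through.
ALPHA_REF = 0.5

def passes_for_texel(alpha):
    p1 = alpha >= ALPHA_REF            # pass 1: alpha test, opaque core
    p2 = 0.0 < alpha < ALPHA_REF       # pass 2: inverted test, blended fringe
    return p1, p2
```

Pass 1 writes depth for the solid core, so only the thin blended fringe depends on the artist-authored inner-to-outer triangle ordering.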

More Performance!
John White (EA Black Box), Colin Barre-Brisebois (EA Montreal)

The most densely packed and inspiring presentation of them all, with most of the topics completely unique and new. There is Separable Bokeh Depth-of-Field, a nice technique to produce a separable blur of uniform hexagonal shape in as few passes as possible. I think it would make sense to combine bokeh and bloom with FFT, because the correct shape of the bloom kernel is the FFT of the camera aperture (at least in the far-field approximation). So in this case, the bloom would have to be a six-pointed star.
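The far-field relation is easy to verify numerically in 1D with a plain-Python DFT (no FFT library needed at this size). The bloom/glare kernel is proportional to |DFT(aperture)|²; for a slit (the 1D analogue of a polygonal aperture edge) this yields the classic sinc²-style pattern, a bright center with decaying side lobes, which is the 1D version of why a hexagonal aperture blooms as a six-pointed star.

```python
import cmath

# Far-field (Fraunhofer) diffraction sketch in 1D: the intensity of the
# DFT of the aperture function gives the bloom kernel shape.
def dft_intensity(aperture):
    n = len(aperture)
    out = []
    for k in range(n):
        s = sum(a * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, a in enumerate(aperture))
        out.append(abs(s) ** 2)
    return out

# an 8-sample-wide slit in a 32-sample domain
aperture = [1.0 if 12 <= i < 20 else 0.0 for i in range(32)]
kernel = dft_intensity(aperture)
```

The kernel peaks at the center (DC term = slit width squared) and has exact nulls at frequency multiples of N/width, i.e. the side-lobe structure that an aperture edge imprints on the bloom.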
Then they describe a cool trick, Hi-Z/Z-Cull Reverse Reload. Remember how they told you not to reverse the depth comparison during a frame, because that would invalidate Hi-Z? This is how to do it anyway. Also notable are Chroma Sub-Sampled Image Processing, and how they implemented Tile-Based Deferred Shading [5].

Dynamic lighting in God of War 3
Vassily Filippov (Sony Santa Monica)

This talk is mostly about aggregating multiple lights in the vertex shader, so the pixel shader only has to calculate a single specular highlight. It then goes to great lengths to make sure the result looks reasonable.

This is interesting, because I did something similar in the past, and I would like to elaborate on it a bit. I can't remember having the numerous problems mentioned in the talk. However, I did the diffuse lighting entirely per vertex, which may explain the difference. The data that I interpolated across triangles was essentially this:

half3  HemiDiffuse      : COLOR0;     // Diffuse hemisphere lighting
half3  HemiSpecular     : COLOR1;     // Specular hemisphere lighting
half3  SunDiffuse       : TEXCOORD0;  // Diffuse lighting for the sun
half3  SunH             : TEXCOORD1;  // Half-angle vector for the sun
half3  PointsDiffuse    : TEXCOORD2;  // Agg. diffuse lighting from points
half3  PointsH          : TEXCOORD3;  // Agg. half-angle vector from points
half3  N                : TEXCOORD4;  // Surface normal

The pixel shader did nothing more than calculate the shapes of the specular highlights and multiply them by the appropriate colors. The specular power and the normalization factor were fixed constants.

// pixel shader:
 
half2 highlights;
highlights.x = pow( saturate( dot( N, SunH ) ), SPEC_POWER );
highlights.y = pow( saturate( dot( N, PointsH ) ), SPEC_POWER );
highlights *= SPEC_NORMFACTOR;
 
half3 result = 0;
 
result += DiffTexture * (
    In.HemiDiffuse + In.SunDiffuse + In.PointsDiffuse );
 
result += SpecTexture * (
    In.HemiSpecular +
    In.SunDiffuse * highlights.x +
    In.PointsDiffuse * highlights.y );

Now on to the interesting thing, the actual aggregation. The key was to accumulate the colors weighted by cosine term and attenuation, but to accumulate the direction weighted by attenuation only.

// vertex shader:
 
float3 PointsL = 0;
for( int i = 0; i < NUM_POINTS; ++i )
{
    half3 dx = LightPosition[i] - WorldPosition;
    // attenuation is some function of distance, e.g. inverse square:
    half attenuation = 1 / ( 1 + dot( dx, dx ) );
    half3 attenL = normalize( dx ) * attenuation;
    Out.PointsDiffuse += saturate( dot( N, attenL ) ) * LightColor[i];
    PointsL += attenL;
}
 
Out.PointsH = normalize( PointsL ) + V;

The sum of the weighted L-vectors is then converted to a half-angle vector and interpolated across the polygon. This works because the shader does not do per-pixel diffuse; instead, the \mathbf{N} \cdot \mathbf{L} term is applied before summation in the vertex shader. Otherwise, if the pixel shader needed to do \mathbf{N} \cdot \mathbf{L}, it would no longer be possible to use the cosine-weighted sum of light colors, giving rise to all the problems mentioned in the talk.
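The diffuse half of that argument can be checked numerically. The following plain-Python sketch (made-up normal, lights, and attenuation values) accumulates each light's color pre-weighted by its cosine term and attenuation, the way the vertex shader above does, and confirms that this equals the exact per-light diffuse sum, so no per-pixel \mathbf{N} \cdot \mathbf{L} is ever needed.

```python
# Verify: summing cosine- and attenuation-weighted light colors in the
# "vertex shader" reproduces the exact many-light diffuse result.
def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

N = (0.0, 1.0, 0.0)                      # surface normal
lights = [  # (unit direction L, attenuation, color) -- example values
    ((0.0, 1.0, 0.0), 1.0, (1.0, 0.9, 0.8)),
    ((0.6, 0.8, 0.0), 0.5, (0.2, 0.4, 1.0)),
]

# vertex-shader style aggregation: one running color per vertex
agg = [0.0, 0.0, 0.0]
for L, att, col in lights:
    w = max(0.0, dot3(N, L)) * att       # saturate(N.L) * attenuation
    agg = [a + w * c for a, c in zip(agg, col)]

# reference: evaluate every light individually and sum
ref = [sum(max(0.0, dot3(N, L)) * att * col[i]
           for L, att, col in lights) for i in range(3)]
```

The specular side has no such exact identity, which is exactly why the single aggregated half-vector is an approximation and why the talk spends so much effort making it behave.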


[1] Thatcher Ulrich, "Notes on spatial partitioning"
http://tulrich.com/geekstuff/partitioning.html
[2] Hansong Zhang, "Effective Occlusion Culling for the Interactive Display of Arbitrary Models", UNC-Chapel Hill
http://www.cs.unc.edu/~zhangh/hom.html
[3] Christian Schüler, "Building a Dynamic Lighting Engine for Velvet Assassin", GDC Europe
http://www.gdcvault.com/free/gdc-europe-09
[4] Stephen Hill, Daniel Collin, "Practical, Dynamic Visibility for Games", GPU Pro 2
http://gpupro2.blogspot.com/
[5] Andrew Lauritzen, "Deferred Rendering for Current and Future Rendering Pipelines"
http://software.intel.com/en-us/articles/deferred-rendering-for-current-and-future-rendering-pipelines/
[6] Ned Greene, "Hierarchical Z-Buffer Visibility"
http://www.cs.princeton.edu/courses/archive/spr01/cs598b/papers/papers.html

2 thoughts on "Comments on Advances in Real Time Rendering 2011"

  1. Hi,
    I'd be very interested to know more about combining the bloom and bokeh with FFT. Do you have any references on this?

    • Hallo John,
      what I meant was that the shapes of the bloom and bokeh kernels are related to one another by the Fourier transform, which is the far-field approximation to diffraction. So, say, if you were to bloom an image with FFT, you obviously need the FFT image of the bloom kernel. By the laws of physics, this should then automatically be viewable as the non-FFT bokeh kernel (and vice versa). I remember some authors wrote a paper on this topic around 2005 (+/-), for automatic generation of bloom kernels from aperture geometry, but I can't find it anymore. Here is a talk on the same subject: http://nae-lab.org/~kaki/paper/PG2004/Kakimoto2004GlarePresen.pdf
