Comments on Advances in Real Time Rendering 2011

I finally got around to writing some comments on this year's Advances in Real Time Rendering course held at SIGGRAPH 2011. Thanks to the RTR team for making the notes available. The talk about physically-based shading in Call of Duty has already been mentioned in my previous post. So, in no particular order:

Rendering in Cars 2
Christopher Hall, Robert Hall, David Edwards (AVALANCHE Software)

At one point, the talk about rendering in Cars 2 describes how they use pre-exposed colors as shader inputs, to avoid precision issues that arise when exposure is applied only after the image has been rendered. I have employed pre-exposed colors with dynamic exposure in the past, and I found them tricky to use. Since there is a delay in the exposure feedback (you must know the exposure of the previous frame to pre-expose the colors for the next frame), you can even get exposure oscillation!
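For illustration, here is a minimal sketch of what pre-exposing can look like on the engine side; the names, the middle-grey key value and the luminance source are my own assumptions, not from the talk:

// Sketch: light colors are multiplied by the exposure *before* they become
// shader constants, so the frame is rendered already exposed and no large
// dynamic range has to survive until a post-process exposure pass.
// The exposure can only be derived from the previous frame, which is exactly
// the feedback delay mentioned above.
#include <algorithm>

struct Color { float r, g, b; };

float ComputeExposure( float avgLuminancePrevFrame )
{
    const float key = 0.18f;  // photographic 'middle grey' target, illustrative
    return key / std::max( avgLuminancePrevFrame, 1e-4f );
}

void PreExposeLightColors( const Color* in, Color* out, int count, float exposure )
{
    for( int i = 0; i < count; ++i )
    {
        // these are the values that get uploaded as shader constants
        out[i].r = in[i].r * exposure;
        out[i].g = in[i].g * exposure;
        out[i].b = in[i].b * exposure;
    }
}

The oscillation comes straight out of this loop: the exposure is derived from an image that was itself rendered with the previous exposure, so a sudden change in scene brightness can over- and undershoot for a few frames.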

Two uses of Voxels in LittleBigPlanet2's graphics engine
Alex Evans and Anton Kirczenow (MediaMolecule)

Lots of intelligent ideas. Unfortunately, most of them only really work for the special case of the 2.5-dimensional world that LBP plays in. I think the best takeaway for generalization is the dynamic AO.

Making Game Worlds from Polygon Soup
Hao Chen, Ari Silvennoinen, Natalya Tatarchuk (Bungie)

This talk is about the spatial database organization for HALO Reach, and the structure that they settled on is the polygon soup. All I can say is +1! An unstructured polygon soup/object soup with a spatial index on top of it (for example, a loose octree [1]), connected to a portal system, is IMHO the way to go to organize 3D data for a dynamic world. I did exactly this in the past and it served me well [3]. Dynamic occlusion culling in the Greene/Zhang style [2,6] can be added on top of it very easily, and in effect, this is what the Umbra middleware provides with software rendering. I like the idea of the automatic portalization that is presented in the talk. Manual portalization is indeed cumbersome and error-prone for the artists, as they correctly state with their outdoor examples. Their solution is similar to the way navigation meshes are built: via a flood fill of a uniform grid followed by cell aggregation.
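As an aside, one reason the loose octree works so well as that spatial index for dynamic objects is that picking the node for an object is a constant-time operation. A rough sketch in the spirit of [1] (looseness factor 2; the names and the origin-based addressing are mine, not from the talk):

// Sketch: O(1) cell selection in a loose octree with looseness factor 2.
// Loose nodes are twice the size of regular cells, so an object of radius r is
// guaranteed to fit into the loose bounds of the cell containing its center
// as long as r <= cellSize / 2; the depth therefore follows from the radius.
#include <algorithm>
#include <cmath>

struct Cell { int depth; int x, y, z; };

Cell ChooseCell( float worldSize, int maxDepth, const float center[3], float radius )
{
    int depth = (int)std::floor( std::log2( worldSize / ( 2.0f * std::max( radius, 1e-6f ) ) ) );
    depth = std::clamp( depth, 0, maxDepth );

    float cellSize = worldSize / (float)( 1 << depth );
    Cell c;
    c.depth = depth;
    c.x = (int)std::floor( center[0] / cellSize );  // assumes the world starts at the origin
    c.y = (int)std::floor( center[1] / cellSize );
    c.z = (int)std::floor( center[2] / cellSize );
    return c;
}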

Secrets of CryENGINE 3 Graphics Technology
Tiago Sousa, Nickolay Kasyan, and Nicolas Schulz (Crytek)

A nice summary of all the little details that, when combined, make a great renderer. Interestingly, this is another case for dynamic occlusion culling via readback of the z-buffer [4]. Do not pass go, do not use occlusion queries! Occlusion queries are a misnomer, because they're the least usable for what they were invented for. When I did this, the latency incurred by the z-buffer readback was unnoticeable on a GeForce2. I did a simple glReadPixels(GL_DEPTH_COMPONENT) over a small viewport of the framebuffer where the occlusion geometry was rendered, and it did not slow things down. Today you would have at least some latency, but once the z information is available on the CPU, it can be used for nice things like determining the number of shadowmap cascades needed to cover the visible depth range.
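For reference, that readback really is as simple as it sounds; here is a sketch in OpenGL terms (the region, the buffering and the cascade use are illustrative, not a recipe):

// Sketch: synchronous depth readback over a small region after the occluder
// pass. On today's hardware you would rather copy into a PBO and fetch the
// result a frame later to hide the latency, but the principle is the same.
#include <GL/gl.h>
#include <algorithm>
#include <vector>

float ReadBackMaxDepth( int x, int y, int width, int height )
{
    std::vector<float> depth( width * height );
    glReadPixels( x, y, width, height, GL_DEPTH_COMPONENT, GL_FLOAT, depth.data() );

    // The farthest visible depth can then drive CPU-side decisions, for example
    // how many shadowmap cascades are needed to cover the visible range.
    return *std::max_element( depth.begin(), depth.end() );
}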
Another thing of note is how the authors describe a technique for blurring alpha-test geometry for hair. It looks very nice but also seems to be a bit expensive. For Velvet Assassin, we used something that I named 'dual-blend': use alpha test for the solid parts of the texture and, in a second pass, alpha-blend (with an inverted alpha test) just the pixels that have an intermediate alpha value, while the artists made sure the triangles are ordered from inner to outer.
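In render-state terms, the dual-blend idea boils down to two passes like the following. This is only a sketch, written against classic OpenGL alpha-test state for illustration; the 0.5 threshold and the draw call are placeholders:

// Sketch of the two 'dual-blend' passes.
#include <GL/gl.h>

void DrawHairGeometry();  // hypothetical: issues the draw call for the hair mesh

void DrawHairDualBlend()
{
    // Pass 1: the solid core. Plain alpha test, depth writes on, no blending.
    glDisable( GL_BLEND );
    glDepthMask( GL_TRUE );
    glEnable( GL_ALPHA_TEST );
    glAlphaFunc( GL_GEQUAL, 0.5f );
    DrawHairGeometry();

    // Pass 2: the soft fringe. Inverted alpha test so that only the
    // intermediate-alpha pixels pass, blended on top, depth writes off.
    // This relies on the triangles being authored in inner-to-outer order.
    glEnable( GL_BLEND );
    glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA );
    glDepthMask( GL_FALSE );
    glAlphaFunc( GL_LESS, 0.5f );
    DrawHairGeometry();

    glDepthMask( GL_TRUE );  // restore state
    glDisable( GL_ALPHA_TEST );
    glDisable( GL_BLEND );
}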

More Performance!
John White (EA Black Box), Colin Barre-Brisebois (EA Montreal)

The most densely packed and inspiring presentation of them all, with most of the topics being completely new and unique. There is Separable Bokeh Depth-of-Field, a nice technique to produce a separable blur of uniform hexagonal shape with as few passes as possible. I think it would make sense to combine bokeh and bloom with FFT, because the correct shape of the bloom kernel is the FFT of the camera aperture (at least in the far-field approximation). So in this case, the bloom would have to be a 6-pointed star.
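In equation form, the relation I mean is the Fraunhofer (far-field) approximation of diffraction: the bloom point-spread function is proportional to the squared magnitude of the Fourier transform of the aperture,

\mathrm{PSF}_{\mathrm{bloom}}(x, y) \;\propto\; \left| \mathcal{F}\{ A \}(x, y) \right|^2

where A(u, v) is the aperture transmission function. With the hexagonal aperture implied by the hexagonal bokeh, this is exactly what produces the 6-pointed star.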
Then they describe a cool trick, Hi-Z / Z-Cull Reverse Reload. Remember how they told you not to reverse the depth comparison during a frame, because that would invalidate Hi-Z? This is how to do it anyway. Also notable are Chroma Sub-Sampled Image Processing, and how they implemented Tile-Based Deferred Shading [5].

Dynamic lighting in God of War 3
Vassily Filippov (Sony Santa Monica)

This talk is mostly about aggregating multiple lights in the vertex shader, so the pixel shader only has to calculate a single specular highlight. It then goes to great lengths to make sure the result looks reasonable.

This is interesting, because I did something similar in the past, and I would like to elaborate on it. I can't remember having the numerous problems mentioned in the talk. However, I did the diffuse lighting entirely per vertex, which may explain the difference. The data that I interpolated across triangles was essentially this:

half3  HemiDiffuse      : COLOR0;     // Diffuse hemisphere lighting
half3  HemiSpecular     : COLOR1;     // Specular hemisphere lighting
half3  SunDiffuse       : TEXCOORD0;  // Diffuse lighting for the sun
half3  SunH             : TEXCOORD1;  // Half-angle vector for the sun
half3  PointsDiffuse    : TEXCOORD2;  // Agg. diffuse lighting from points
half3  PointsH          : TEXCOORD3;  // Agg. half-angle vector from points
half3  N                : TEXCOORD4;  // Surface normal

The pixel shader did nothing more than calculate the shapes of the specular highlights and multiply them with the appropriate colors. The specular power and the normalization factor were fixed constants.

// pixel shader:
 
half2 highlights;
highlights.x = pow( saturate( dot( In.N, In.SunH ) ), SPEC_POWER );
highlights.y = pow( saturate( dot( In.N, In.PointsH ) ), SPEC_POWER );
highlights *= SPEC_NORMFACTOR;
 
half3 result = 0;
 
result += DiffTexture * (
    In.HemiDiffuse + In.SunDiffuse + In.PointsDiffuse );
 
result += SpecTexture * (
    In.HemiSpecular +
    In.SunDiffuse * highlights.x +
    In.PointsDiffuse * highlights.y );

Now on to the interesting thing, the actual aggregation. The key was to accumulate the colors weighted by the cosine term and attenuation, but to accumulate the direction weighted by attenuation only.

// vertex shader:
 
float3 PointsL = 0;
Out.PointsDiffuse = 0;
for( int i = 0; i < NUM_POINTS; ++i )
{
    half3 dx = LightPosition[i] - WorldPosition;
    half attenuation = 1.0 / dot( dx, dx );  // placeholder: some function of distance
    half3 attenL = normalize( dx ) * attenuation;
    Out.PointsDiffuse += saturate( dot( N, attenL ) ) * LightColor[i];
    PointsL += attenL;
}
 
Out.PointsH = normalize( PointsL ) + V;

The sum of the weighted L-vectors is then converted to a half-angle vector and interpolated across the polygon. This works because the shader does not do per-pixel diffuse; instead, the \mathbf{N} \cdot \mathbf{L} term is applied before summation in the vertex shader. Otherwise, if the pixel shader needed to do \mathbf{N} \cdot \mathbf{L}, it would no longer be possible to use the cosine-weighted sum of light colors, giving rise to all the problems mentioned in the talk.
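Written out, what the vertex shader above accumulates is

\mathrm{PointsDiffuse} = \sum_i \mathrm{sat}( \mathbf{N} \cdot a_i \hat{\mathbf{L}}_i ) \, \mathbf{c}_i \qquad \mathrm{PointsL} = \sum_i a_i \hat{\mathbf{L}}_i

with a_i the attenuation, \hat{\mathbf{L}}_i the unit direction and \mathbf{c}_i the color of the i-th point light. A per-pixel evaluation with a single aggregated direction would instead have to compute something like \mathrm{sat}( \mathbf{N} \cdot \widehat{\mathrm{PointsL}} ) \sum_i \mathbf{c}_i, which is in general not equal to the cosine-weighted sum, and that mismatch is where the problems described in the talk begin.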


[1] Thatcher Ulrich, “Notes on spatial partitioning”
http://tulrich.com/geekstuff/partitioning.html
[2] Hansong Zhang, “Effective Occlusion Culling for the Interactive Display of Arbitrary Models”, UNC-Chapel Hill
http://www.cs.unc.edu/~zhangh/hom.html
[3] Christian Schüler, “Building a Dynamic Lighting Engine for Velvet Assassin”, GDC Europe
http://www.gdcvault.com/free/gdc-europe-09
[4] Stephen Hill, Daniel Collin, “Practical, Dynamic Visibility for Games”, GPU Pro 2
http://gpupro2.blogspot.com/
[5] Andrew Lauritzen, “Deferred Rendering for Current and Future Rendering Pipelines”
http://software.intel.com/en-us/articles/deferred-rendering-for-current-and-future-rendering-pipelines/
[6] Ned Greene, “Hierarchical Z-Buffer Visibility”
http://www.cs.princeton.edu/courses/archive/spr01/cs598b/papers/papers.html

2 thoughts on “Comments on Advances in Real Time Rendering 2011”

  1. Hi,
    I'd be very interested to know more about combining the bloom and bokeh with FFT. Do you have any references on this?

    • Hello John,
      what I meant was that the shapes of the bloom and bokeh kernels are related to one another by the Fourier transform, which is the far-field approximation to diffraction. So, say, if you were to bloom an image with FFT, you obviously need the FFT image of the bloom kernel. By the laws of physics, that image should then automatically be the non-FFT bokeh kernel (and vice versa). I remember some authors wrote a paper on this topic around 2005 (+/-), on automatic generation of bloom kernels from aperture geometry, but I can't find it anymore. Here is a talk on the same subject: http://nae-lab.org/~kaki/paper/PG2004/Kakimoto2004GlarePresen.pdf
