Tag Archives: C++

Close to deferring the deferred demo post…

Yeah, I came quite close to postponing this post until tomorrow, but I won’t. 🙂
The title is humorous but clear: my latest demo deals with deferred shading again.

GLSL_deferred with SSAO, iterative parallax mapping and Depth of Field

This time I made every effort to do things right: there is support for directional, point and spot lights and, much more important for performance, lights are rendered only inside their influence volumes using projective texturing.
This means that while directional ones affect everything and have to be rendered as full screen quads, point lights are only rendered as cubes and spot lights as pyramids.

Comparison between disabled and enabled SSAO

This is slightly different from the usual approach, which uses spheres and cones but is also a bit more vertex heavy. 🙂
Rendering bounding shapes should barely affect performance, but it makes the check for the camera position a bit different.
The check is disabled by default, as it is sufficient to cull front faces and invert the depth test to avoid the double lighting issue and benefit from some depth optimization.
Enabling the check exposes my poor approximation, which tests whether the camera is inside a pyramid using a formula meant for a cone, so it is better left disabled. As long as the volume does not intersect the far plane, everything should behave as expected… 😀
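To illustrate the kind of cone formula mentioned above, here is a minimal sketch of a point-in-cone test. The `Vec3` type, the `insideCone` function and its parameters are hypothetical and are not taken from the demo sources; the sketch also assumes a unit-length axis and a half-angle below 90 degrees.

```cpp
#include <cmath>

// Hypothetical minimal 3D vector, for illustration only.
struct Vec3 { float x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true if point p lies inside a finite cone described by its apex,
// a unit-length axis direction, a half-angle in radians (assumed < pi/2)
// and a height measured along the axis.
bool insideCone(const Vec3& p, const Vec3& apex, const Vec3& axis,
                float halfAngle, float height)
{
    const Vec3 v = sub(p, apex);
    const float along = dot(v, axis);   // signed distance along the axis
    if (along < 0.0f || along > height)
        return false;                   // behind the apex or beyond the base
    const float dist2 = dot(v, v);
    const float c = std::cos(halfAngle);
    // angle(v, axis) <= halfAngle  <=>  along >= |v| * cos(halfAngle)
    return along * along >= dist2 * c * c;
}
```

Since the corners of a pyramid stick out of the cone inscribed in it, a test like this can only approximate the pyramid case, which is consistent with the caveat above.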

Comparison between normal mapping and iterative parallax mapping

The demo is also a bigger attempt at integrating effects; it will serve as a test bed for a more organized, higher-level framework.
As a matter of fact it also features Screen Space Ambient Occlusion, Iterative Parallax Mapping and Depth of Field.
Very cool stuff, but I already have a list of things that I would like to implement sooner or later, like a Light Pre-Pass, a couple of advanced shadow mapping techniques, the integration of water rendering, HDR illumination and the reconstruction of position from depth.

Normal rendering and Depth of Field blur

I have published the sources and a video (Vimeo | YouTube), which for the first time comes with a nice soundtrack: it’s Nova Siberia by Big Giant Circles (Jimmy Hinson) from OverClocked ReMix! 😉

Note: I have had a hard time with glc, x264, mencoder and ffmpeg, and YouTube still doesn’t accept my video together with the sound, so for the moment I have uploaded a mute version.

Gimme some (real-time) water!

Computer generated water has always interested me; since the days of POVRay on the Amiga I have been trying to simulate it in some way.
Some days ago, while studying another technique, I put everything aside because in that particular moment I felt the urge to implement a water shader. 😀

water_crop

I began looking for existing implementations and found Riemer’s XNA tutorial, a simple approach which I think could be optimized, but which already produces nice-looking water.

The technique is composed of four passes:

  • Rendering the reflection map
  • Rendering the refraction map
  • Rendering the scene
  • Rendering the water plane linearly interpolating the two maps with a Fresnel term

One drawback is that the whole scene is rendered three times during the first three passes. I’m sure this procedure could be optimized, but I was lazy enough to skip any further testing. 🙂
Moreover having every pass clearly separated helps with debugging and makes the application capable of displaying them one at a time.

The scene is rendered with parallax mapping enabled, which is more evident than ever thanks to the new brick textures, but with an altered shader that also performs user plane clipping, decisive for the first two passes.
As for the Fresnel term, I have implemented both a naive version (nothing more than a dot(V, N)) and a better approximation based on Nvidia’s Fresnel reflection paper.
In the source you will find both, but only the first one is actually used, as it works better in the scene used in this demo.
You can easily see that the waves are fake: the water is composed of just two triangles, and the ripple animation is generated by the fragment shader, which alters texture coordinates based on a normal map and a time variable.
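As a CPU-side illustration of the final blend, the sketch below mixes a reflection and a refraction colour with the naive dot(V, N) term. It also shows Schlick's approximation, a widely used formula that stands in here for "a better approximation"; it is not necessarily the one from the Nvidia paper. The `Color` type, the function names and the reflectance value of about 0.02 for water are assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical RGB colour type, for illustration only.
struct Color { float r, g, b; };

// Linear interpolation, like GLSL mix(): t = 0 gives a, t = 1 gives b.
Color mix(const Color& a, const Color& b, float t)
{
    return {a.r + (b.r - a.r) * t, a.g + (b.g - a.g) * t, a.b + (b.b - a.b) * t};
}

// Naive term: the cosine of the angle between the unit view vector V
// (surface to eye) and the unit normal N, i.e. dot(V, N), clamped to [0, 1].
float naiveFresnel(float cosTheta)
{
    return std::min(1.0f, std::max(0.0f, cosTheta));
}

// Schlick's approximation of the reflectance. r0 is the reflectance at
// normal incidence (roughly 0.02 for an air/water interface).
float schlickFresnel(float cosTheta, float r0)
{
    const float c = std::min(1.0f, std::max(0.0f, cosTheta));
    return r0 + (1.0f - r0) * std::pow(1.0f - c, 5.0f);
}

// Looking straight down (cosTheta = 1) shows mostly refraction,
// grazing angles (cosTheta = 0) show mostly reflection.
Color shadeWater(const Color& reflection, const Color& refraction, float cosTheta)
{
    return mix(reflection, refraction, naiveFresnel(cosTheta));
}
```

Note that `schlickFresnel` returns a reflectance, so with it the blend would be `mix(refraction, reflection, schlickFresnel(cosTheta, 0.02f))`, with the arguments swapped.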

Of course you can have a look at videos on YouTube (GLSL_water, GLSL_water_HD) or Vimeo (GLSL_water, GLSL_water_HD) and download the sources.

High Dynamic Range galore

In January, during my internship, I was doing research in the field of HDR imaging. Today I finally had the time to polish and release the two demos I made back then.

They both load an RGBE image (the two you see here are courtesy of Paul Debevec’s Light Probe Image Gallery) through Bruce Walter’s library.

Light probe at different exposure levels (hdr_load1)

The first demo implements the technique described in the article High Dynamic Range Rendering published on GameDev.net and is based on five passes and four FBOs:

  1. Rendering of the floating-point texture in an FBO
  2. Down-sampling in a 1/4 FBO and high-pass filter
  3. Gaussian filter along the X axis
  4. Gaussian filter along the Y axis
  5. Tone-mapping and composition

The algorithm is very simple: it first renders the original scene, then extracts the bright parts in the second pass, which merely blacks out fragments that are below a specified threshold:

// excerpt from hipass.frag
if (colorMap.r > 1.0 || colorMap.g > 1.0 || colorMap.b > 1.0)
	gl_FragColor = colorMap;
else
	gl_FragColor = vec4(0.0);

The third and fourth passes blur the bright mask, while the last one mixes it with the first FBO and applies exposure and gamma to achieve a bloom effect.

// excerpt from tonemap.frag
gl_FragColor = colorMap + Factor * (bloomMap - colorMap);
gl_FragColor *= Exposure;
gl_FragColor = pow(gl_FragColor, vec4(Gamma));

Light probe at different exposure levels (hdr_load2)

The second demo implements the technique described in the article High Dynamic Range Rendering in XNA published on Ziggyware and is based on seven passes and more than five FBOs:

  1. Rendering of the floating-point texture in an FBO
  2. Calculating maximum and mean luminance for the entire scene
  3. Bright-pass filter
  4. Gaussian filter along the X axis
  5. Gaussian filter along the Y axis
  6. Tone-mapping
  7. Bloom layer addition

This approach is far more complex than the previous one and is based on converting the scene to its luminance version (defined as Y = 0.299*R + 0.587*G + 0.114*B). The mean and maximum values can be calculated with a dedicated downsampling shader working over multiple passes, each one rendering to an FBO with a smaller resolution than the previous one, until the last pass renders the luminance of the entire scene to a 1×1 FBO.

As usual you can have a look at YouTube (GLSL_hdrload1, GLSL_hdrload2) or Vimeo (GLSL_hdrload1, GLSL_hdrload2) videos and download the sources.

A flexible PLY loader for Evolution War r71

I mentioned Evolution War for the first time on this blog in my previous post; today I want to celebrate my return to SVN commits after a very long time. 😀

PLY Export

Revision 71 adds support for a real Stanford PLY parser and loader. While the one I coded for my graphics class library is very primitive and expects a hard-coded order for the data, this one shouldn’t have any problem with any well-formed PLY file.

For example, while the hard-coded loader can only accept a file like this:

ply
format ascii 1.0
comment Created by Blender3D 249 - www.blender.org
element vertex 4
property float x
property float y
property float z
property float nx
property float ny
property float nz
element face 3
property list uchar uint vertex_indices
end_header
1.000000 2.000000 3.000000 -4.000000 -5.000000 6.000000 
-1.000000 -2.000000 -3.000000 4.000000 5.000000 6.000000 
1.000000 2.000000 3.000000 -4.000000 -5.000000 6.000000 
-1.000000 -2.000000 -3.000000 4.000000 5.000000 6.000000  
3 0 1 2 
3 1 3 2 
3 4 2 1 

the new parser-based loader can even load something like this:

ply
format ascii 1.0
comment Created and shuffled by hand
element face 3
property list uchar uint vertex_indices
element skipme 3
property float skipfirst
property float skipsecond
element vertex 4
property float z
property float nz
property float y
property float ny
property float nx
property float x
end_header
3 0 1 2 
3 1 3 2 
3 4 2 1 
0.0 0.0 
0.0 0.0
0.0 0.0 
3.000000 -6.000000 2.000000 -5.000000 -4.000000 1.000000 
-3.000000 6.000000 -2.000000 5.000000 4.000000 -1.000000 
3.000000 -6.000000 2.000000 -5.000000 -4.000000 1.000000 
-3.000000 6.000000 -2.000000 5.000000 4.000000 -1.000000 
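To show why element and property order stop mattering once the header is actually parsed, here is a minimal sketch of an ASCII PLY header parser. It is not the Evolution War code: the `PlyProperty` and `PlyElement` structures and the `parsePlyHeader` and `propertyIndex` functions are hypothetical, and the sketch ignores the `format` line and error handling.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct PlyProperty {
    std::string name;
    std::string type;      // scalar type, or item type for lists
    bool isList = false;
};

struct PlyElement {
    std::string name;
    std::size_t count = 0;
    std::vector<PlyProperty> properties;
};

// Reads header lines up to "end_header", recording elements and their
// properties in the order in which the file declares them.
std::vector<PlyElement> parsePlyHeader(std::istream& in)
{
    std::vector<PlyElement> elements;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);
        std::string keyword;
        tokens >> keyword;
        if (keyword == "end_header")
            break;
        if (keyword == "element") {
            PlyElement element;
            tokens >> element.name >> element.count;
            elements.push_back(element);
        } else if (keyword == "property" && !elements.empty()) {
            PlyProperty property;
            tokens >> property.type;
            if (property.type == "list") {
                std::string countType;
                property.isList = true;
                tokens >> countType >> property.type;  // e.g. "uchar uint"
            }
            tokens >> property.name;
            elements.back().properties.push_back(property);
        }
        // "ply", "format" and "comment" lines are skipped in this sketch.
    }
    return elements;
}

// Position of a named property inside an element, or -1 if it is missing.
int propertyIndex(const PlyElement& element, const std::string& name)
{
    for (std::size_t i = 0; i < element.properties.size(); ++i)
        if (element.properties[i].name == name)
            return static_cast<int>(i);
    return -1;
}
```

A loader built on top of this would walk the elements in declaration order, skip the ones it does not know (such as `skipme`), and use `propertyIndex` to find where `x`, `y`, `z` and the normals sit in each vertex row.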

But one of its most important features is the ability to correctly load binary PLY files! 🙂

Related to it there’s a bug I would like to share with you together with the fix:

istream& istream::read (char* s, streamsize n);
[...]
ifstream ifs;
unsigned int *uIndices;
[...]
ifs.read((char *) uIndices+(j*3), sizeof(unsigned int) * 3);

The read() method only accepts char pointers, so uIndices is cast, but the cast binds more tightly than the addition: the offset is applied in bytes to the char pointer instead of in elements to the native unsigned int pointer, leading to catastrophic effects! 😮

The fix was as simple as the bug was subtle:

-ifs.read((char *) uIndices+(j*3), sizeof(unsigned int) * 3);
+ifs.read((char *) (uIndices+(j*3)), sizeof(unsigned int) * 3);
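The difference between the two expressions can be measured directly. The two helper functions below are hypothetical and exist only to expose the byte offset that each form produces.

```cpp
#include <cstddef>

// Buggy form: the cast is applied first, so j * 3 is added to a char
// pointer and therefore counts bytes.
std::ptrdiff_t buggyByteOffset(unsigned int* base, std::size_t j)
{
    char* p = (char*) base + (j * 3);
    return p - (char*) base;
}

// Fixed form: the addition happens on the unsigned int pointer, so j * 3
// counts elements, and only the result is cast.
std::ptrdiff_t fixedByteOffset(unsigned int* base, std::size_t j)
{
    char* p = (char*) (base + (j * 3));
    return p - (char*) base;
}
```

With `j = 2` the buggy form advances by 6 bytes, while the fixed one advances by `6 * sizeof(unsigned int)` bytes. Writing the cast as `reinterpret_cast<char*>(uIndices + j * 3)` makes the grouping explicit and the mistake harder to repeat.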

Yet another toon shader

Maybe it is true, as I wrote in the README file, that I coded this demo because I felt like the only one who hadn’t yet implemented a toon shader. 🙂
Actually this is not the only reason: the inspiration came while I was presenting the first part of my updated Modern GPUs slides at the university; this time the event was organized by some students and advertised with leaflets. 😉
So, for the second part, which will be held next Wednesday, I’m planning to include an explanation of the internals of this demo.

From untextured to textured with outlines

It was easy and fast to have a basic toon shader working, thanks to the Lighthouse 3D tutorial.
This version uses a cascade of if-then-else instead of a more usual 1D texture lookup but, judging from the tests I have run, it’s not a performance issue, at least on GeForce 8 and newer cards.
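The two ways of quantizing the diffuse intensity can be compared with a small CPU-side sketch. The thresholds and output levels below are illustrative values, not the ones used in the demo, and `toonLookup` imitates a 1D texture with nearest filtering using a plain array.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Cascade of if-then-else: maps a diffuse intensity in [0, 1] to one of
// four flat shades (illustrative thresholds and levels).
float toonIfCascade(float intensity)
{
    if (intensity > 0.95f)
        return 1.0f;
    else if (intensity > 0.5f)
        return 0.6f;
    else if (intensity > 0.25f)
        return 0.4f;
    else
        return 0.2f;
}

// Lookup alternative: the intensity indexes a ramp of shades, as a 1D
// texture sampled with nearest filtering would. The ramp must not be empty.
float toonLookup(float intensity, const std::vector<float>& ramp)
{
    const float clamped = std::min(1.0f, std::max(0.0f, intensity));
    const std::size_t index = std::min(
        ramp.size() - 1,
        static_cast<std::size_t>(clamped * static_cast<float>(ramp.size())));
    return ramp[index];
}
```

The lookup spaces its steps evenly unless the ramp is made larger, while the cascade can place thresholds anywhere at the cost of a few branches.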

For edge detection I wanted to exploit the fragment shader capabilities, working in screen space with the Sobel operator and thus being independent of geometric complexity.

The only problem was about *what* to filter.

  1. The first test was straightforward: I filtered the rendered image, a grey version of the textured and lit MrFixit head, but the results were poor because the edge detection outlined the toon lighting shades too.
  2. In the second one I decided to filter the depth buffer. I could get rid of the colour to grey conversion but, again, the results were not satisfactory: there were no outlines inside the model, just a contour all around.
    Maybe it could have been corrected by tuning the clip planes per model, but I gave up.
  3. With the third test I filtered the unlit color texture and the results were better. Unfortunately it relied on the presence of a texture and outlined too many details.
  4. I think the fourth approach, as seen in this demo, is the best one.
    I used MRTs to save the eye-space normal buffer during the toon shader pass, then I filtered a grey version of it, outlining the contour plus some other geometric details.
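Whatever buffer is chosen as input, the filter itself stays the same. Below is a CPU-side sketch of the Sobel operator on a greyscale image, which is what the fragment shader would evaluate per pixel; the `sobel` function and its row-major layout are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Computes the Sobel gradient magnitude of a row-major greyscale image.
// Border pixels are left at zero to keep the sketch short.
std::vector<float> sobel(const std::vector<float>& grey,
                         std::size_t width, std::size_t height)
{
    std::vector<float> edges(grey.size(), 0.0f);
    if (width < 3 || height < 3)
        return edges;

    auto at = [&](std::size_t x, std::size_t y) { return grey[y * width + x]; };

    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            // Horizontal and vertical 3x3 Sobel kernels.
            const float gx = -at(x - 1, y - 1) + at(x + 1, y - 1)
                             - 2.0f * at(x - 1, y) + 2.0f * at(x + 1, y)
                             - at(x - 1, y + 1) + at(x + 1, y + 1);
            const float gy = -at(x - 1, y - 1) - 2.0f * at(x, y - 1) - at(x + 1, y - 1)
                             + at(x - 1, y + 1) + 2.0f * at(x, y + 1) + at(x + 1, y + 1);
            edges[y * width + x] = std::sqrt(gx * gx + gy * gy);
        }
    }
    return edges;
}
```

An outline pass would then compare each magnitude against a threshold and draw a dark pixel where it is exceeded.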

A small note: saving an already grey converted buffer in the toon shader pass speeds up the demo a bit, but storing the normal in a single 8-bit component of the texture causes a loss of precision that leads to some visible artefacts.
Using a floating point texture helps with the precision issue but makes the demo too slow.
Maybe I should try using a single component texture or some kind of RGBA packing algorithm…

As usual you can have a look at YouTube or Vimeo videos and download the sources.

Blurring the parallax

Today I have published the first demo making use of my new C++ class library. I designed it to be easily portable to a strict GL3 profile or to ES 2.0.

From plain rendering to depth of field

As a matter of fact, it doesn’t make use of the fixed pipeline or deprecated functions at all:

  • No immediate mode, only VBOs
  • No use of OpenGL matrix stacks, I have my classes handling transformations and passing matrices to shaders directly
  • No OpenGL lighting, only per-fragment one
  • No quads or polygons, just triangles

Normal versus parallax mapping

I couldn’t release something only to show changes “under the hood”; I had to make something cool, so I decided to mix together parallax mapping (which, as you can see in the screenshot, is a lot more pronounced now) and depth of field, with the little addition of Stanford PLY mesh loading. 😀

The Mr.Fixit model and maps (the character players portray in Sauerbraten) are courtesy of John Siar; thank you, John. 😉

As usual, you can have a look at the Vimeo videos (640×480, 1280×720) and download the sources.

Back to work, Mars r622

Yesterday my exam session finally ended. I’m really satisfied with the results, but during the study period I was eager to get back to coding…

Today I fixed bug #0000008, an annoying one which caused program termination if the user tried to take a screenshot after having deleted the hidden directory inside their home where settings and images are saved by default. 🙂

Mars 0.1.1 2nd

The first thing I thought was that I was missing a fopen() return code check, but, fortunately for my reputation, it wasn’t the case. 😉

// Opening output file
if((fp = fopen(filename, "wb")) == NULL)
{
  throw Exception("Screen", "fopen error");
  return -1;
}

The problem, as the shell output was suggesting, was related to the exception system: the fopen() exception was never caught.

Just changing this:

case SDLK_F4:
  screen->TakeScreenshot();
  break;

into this:

case SDLK_F4:
  try
  {
    screen->TakeScreenshot();
  }
  catch(Exception e)
  {
    e.PrintError();
  }
  break;

fixed everything.

Have a look at the r622 log and at the mars.cpp changes, and remember to catch all the exceptions you may throw. 😉
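The same pattern can be sketched in a self-contained way with the standard exception hierarchy. The `ScreenshotError` class and the two functions below are hypothetical and only mirror the structure of the fix; the handler catches by const reference, which avoids copying the exception object and slicing derived types.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical exception type standing in for the project's own class.
class ScreenshotError : public std::runtime_error {
public:
    explicit ScreenshotError(const std::string& what) : std::runtime_error(what) {}
};

// Throws if the output file cannot be opened, for example because its
// directory has been deleted.
void writeScreenshot(const std::string& filename)
{
    std::FILE* fp = std::fopen(filename.c_str(), "wb");
    if (fp == nullptr)
        throw ScreenshotError("fopen error: " + filename);
    // The pixel data would be written here.
    std::fclose(fp);
}

// Caller side: the failure is reported and the program keeps running.
bool tryScreenshot(const std::string& filename)
{
    try {
        writeScreenshot(filename);
        return true;
    } catch (const ScreenshotError& e) {
        std::fprintf(stderr, "%s\n", e.what());
        return false;
    }
}
```

An uncaught exception ends in `std::terminate()`, which matches the abrupt program termination described in the bug report.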

Mars r594 and the vflip hack

For my first entry on this blog let me tell you a tale about the OpenGL framebuffer and vertical flipping…

Mars 0.1.1 1st

Once upon a time a little fool called Encelo used to perform, in a little testing program, a vertical flip of the entire OpenGL framebuffer this way:

if(_flags & SDL_OPENGL)
{

  GLvoid * pixels;

  pixels = (GLvoid *) malloc(_width * _height * 4);
  glPushAttrib(GL_COLOR_BUFFER_BIT | GL_CURRENT_BIT | GL_PIXEL_MODE_BIT);
  glReadBuffer(GL_FRONT);
  glReadPixels(0, 0, _width, _height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
  glDrawBuffer(GL_BACK);
  glRasterPos2f(-1.0f, 1.0f);
  glPixelZoom(1.0f, -1.0f);
  glDrawPixels(_width, _height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
  glReadBuffer(GL_BACK);
  glReadPixels(0, 0, _width, _height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
  glPopAttrib();
  output_surf = SDL_CreateRGBSurfaceFrom(pixels, _width, _height, 32, _surface->pitch, rmask, gmask, bmask, amask);
}

It wasn’t really bad, it performed some interesting tricks with buffers, and, as a matter of fact, Encelo was really proud of this implementation. 🙂
But… it didn’t work on Mars. Yes, no matter how much Encelo tested, changed, and tested again, it simply didn’t work on anything other than the original testing program.
A decision had to be taken soon, to persevere or not to persevere? That was the question.

Encelo chose not to persevere and to try a completely different approach… memcpy() flipping! 😀
Yeah, something as simple, elegant, fast and smart as this:

if(_flags & SDL_OPENGL)
{
  int row, stride;
  GLubyte * swapline;
  GLubyte * pixels;

  stride = _width * 4; // length of a line in bytes
  pixels = (GLubyte *) malloc(stride * _height);
  swapline = (GLubyte *) malloc(stride);

  glReadPixels(0, 0, _width, _height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

  // vertical flip
  for(row = 0; row < _height/2; row++)
  {
    memcpy(swapline, pixels + row * stride, stride);
    memcpy(pixels + row * stride, pixels + (_height - row - 1) * stride, stride);
    memcpy(pixels + (_height - row -1) * stride, swapline, stride);
  }

  output_surf = SDL_CreateRGBSurfaceFrom(pixels, _width, _height, 32, _surface->pitch, rmask, gmask, bmask, amask);
}

This story is true, and happened exactly a month ago, on the 30th of November 2006.
As proof, have a look at the r594 log and at the Screen.cpp changes. 😉
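For readers who want to try the row-swapping idea outside of Mars, here is a standalone sketch of the same flip on a plain byte buffer. It uses `std::swap_ranges` instead of a temporary line and three `memcpy()` calls; the `flipVertically` function and its parameters are hypothetical and assume a tightly packed buffer with no row padding.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Flips a tightly packed image in place by swapping row k with row
// (height - 1 - k). The buffer must hold width * height * bytesPerPixel bytes.
void flipVertically(std::vector<unsigned char>& pixels,
                    std::size_t width, std::size_t height,
                    std::size_t bytesPerPixel = 4)
{
    const std::ptrdiff_t stride =
        static_cast<std::ptrdiff_t>(width * bytesPerPixel);  // bytes per row
    for (std::size_t row = 0; row < height / 2; ++row) {
        auto top = pixels.begin() + static_cast<std::ptrdiff_t>(row) * stride;
        auto bottom = pixels.begin()
                      + static_cast<std::ptrdiff_t>(height - row - 1) * stride;
        std::swap_ranges(top, top + stride, bottom);
    }
}
```

With an odd height the middle row is left untouched, exactly as in the loop over `_height/2` rows above.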