Probably more “can’t” than “won’t” but whatevs

There’s a D3D error that’s been discussed in a thread on GameDev.net involving a crash when:

  • attempting to switch to fullscreen mode
  • at a resolution different from the desktop
  • on Vista
  • with an Nvidia card
  • while using the D3D debug runtime

I got bit by this bug today. It crashes on a call to IDirect3DDevice9::Reset() with the error message:
Direct3D9: (ERROR) :Lost due to display uniqueness change
The preceding call to IDirect3DDevice9::TestCooperativeLevel() returns D3D_OK, so I’m not sure what more I could do to make sure the device is in a good state to switch screen resolutions.

After bashing my head against the wall for a while trying to come up with a solution, I decided to run a sanity check. I left the D3D debug runtime enabled with breaking on errors, and I fired up some commercial games to see whether I could reproduce this crash. Of the six games I tested, three crashed immediately in this exact case (trying to go to a resolution different from the desktop in fullscreen mode). That’s good enough for me. I’m filing this one under “WON’T FIX.”

Oh W divide, you so crazy

Does it make me a bad coder that I only just today realized that the W component of a vertex post-projection is actually its depth from the camera in view-space units (for a standard perspective projection, anyway)? I’m not sure how or why I never knew that, but it makes it easy to compute linear depth in a pixel shader, knowing also that the depth of a pixel is computed as (Z/W).

// Standard WVP multiply on a float3 position. Left as-is, depth will be non-linear in the pixel shader.
vs_out.pos = mul(float4(in_pos, 1), wvpMat);

// W is the actual unit depth from the camera
vs_out.pos.z = vs_out.pos.w;

// Subtract off the depth of the near plane
vs_out.pos.z -= NearDepth;

// Scale by the difference between the near and far planes.
// This could also be done as a single multiply if 1.0f / (FarDepth - NearDepth) were saved off.
vs_out.pos.z /= (FarDepth - NearDepth);

// And finally, multiply by W in anticipation of the automatic per-pixel W divide before the pixel shader.
// This step should only be done when calculating linear depth for the POSITION semantic.
// If linear depth is being written to, e.g., a TEXCOORDn channel, omit this step.
vs_out.pos.z *= vs_out.pos.w;

Presto, linear depth! This is useful for shadow maps and various postprocess effects like depth of field, when the usual front-loaded depth values are insufficient.
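To convince myself the arithmetic holds up, here is a rough CPU-side sketch of the same steps in C++. The function names (LinearClipZ, DepthAfterWDivide, StandardDepth) are made up for illustration, viewDepth stands in for vs_out.pos.w, and StandardDepth assumes a D3D-style projection that maps the near plane to 0 and the far plane to 1. It only checks values at a single vertex; it does not model how the rasterizer interpolates across a triangle.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the vertex shader steps above.
// viewDepth plays the role of vs_out.pos.w.
float LinearClipZ(float viewDepth, float nearDepth, float farDepth)
{
    assert(farDepth > nearDepth);
    float z = viewDepth;              // W is the unit depth from the camera
    z -= nearDepth;                   // subtract off the near plane
    z /= (farDepth - nearDepth);      // scale into the 0..1 range
    z *= viewDepth;                   // pre-multiply by W ahead of the W divide
    return z;
}

// Stand-in for the automatic W divide that happens after the vertex shader.
float DepthAfterWDivide(float clipZ, float clipW)
{
    return clipZ / clipW;
}

// Assumed D3D-style perspective depth, for comparison:
// clip z = far / (far - near) * (w - near), and depth = clip z / w.
float StandardDepth(float viewDepth, float nearDepth, float farDepth)
{
    assert(farDepth > nearDepth);
    return (farDepth / (farDepth - nearDepth)) * (viewDepth - nearDepth) / viewDepth;
}
```

With a near plane at 1 and a far plane at 101, a vertex halfway between them (view depth 51) comes out at 0.5 through the linear path, while StandardDepth puts it at about 0.99. That is the front-loading mentioned above.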

I also discovered today that trying to do the W divide in the vertex shader is bad and wrong and bad. Interpolating (Z/W) is not the same as interpolating Z and W separately and then dividing. I was trying to implement motion blur as a postprocess and was seeing artifacts on large triangles, especially when vertices were outside or behind the view frustum. Moving the W divide (for the current and last screen positions, to calculate screen-space velocity from) into the pixel shader fixed these artifacts.
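The "not the same" part is easy to check with plain numbers. This is a minimal C++ sketch, not shader code: ClipPoint and both function names are hypothetical, and t is just a generic interpolation parameter rather than a model of the rasterizer's actual weights.

```cpp
#include <cassert>
#include <cmath>

// One clip-space coordinate plus W; hypothetical type for illustration.
struct ClipPoint
{
    float x;
    float w;
};

float Lerp(float a, float b, float t)
{
    return a + (b - a) * t;
}

// Divide at each vertex, then interpolate the quotient
// (what doing the W divide in the vertex shader amounts to).
float DivideThenInterpolate(ClipPoint a, ClipPoint b, float t)
{
    assert(a.w != 0.0f && b.w != 0.0f);
    return Lerp(a.x / a.w, b.x / b.w, t);
}

// Interpolate x and w separately, then divide once
// (what doing the W divide in the pixel shader amounts to).
float InterpolateThenDivide(ClipPoint a, ClipPoint b, float t)
{
    const float w = Lerp(a.w, b.w, t);
    assert(w != 0.0f);
    return Lerp(a.x, b.x, t) / w;
}
```

For endpoints (x = 0, w = 1) and (x = 4, w = 4) at t = 0.5, the first function gives 0.5 and the second gives 0.8. The two agree at the endpoints and drift apart in between, which fits with the artifacts being worst on large triangles.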

Motion Blur

A cautionary tale

I recently made a change which improved the perf of my test app’s debug build by something ridiculous like 650%. It turned out I was unwittingly allocating and freeing some memory each frame due to a bug in one of my container classes. So I fixed it, happy with the perf win, and went on my way. But then I started thinking about it later, and I decided to conduct a little experiment. I ran my test app and looked at the framerate. About 1500 fps. I then made a tiny, seemingly innocuous one-line change and ran it again. This time I got 500 fps. This one-line change was as follows:

delete new int;

That’s right, a single allocation and deletion of one integer every frame tripled my frame time. Now granted, this was a debug build, with safety checks enabled, so allocations are apt to be slower, but still. The one line took as much time as the entire rest of my frame logic twice over. That’s pretty staggering. In some sense, I suppose it’s fortunate that a single new/delete pair could have such a drastic impact, else I might never have caught this particular bug. Either way, it reinforced a critical point: don’t allocate things every frame!

Twist ending: This entry is actually a cautionary tale against rewriting STL classes. This never would’ve happened if I’d just used std::vector…
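For what it's worth, here is roughly what the std::vector version of "don't allocate every frame" could look like. This is a sketch, not my actual container: FrameScratch and its methods are hypothetical names, and it assumes the per-frame item count has a known upper bound that can be reserved up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame scratch buffer: allocate once, reuse every frame.
class FrameScratch
{
public:
    explicit FrameScratch(std::size_t maxItems)
    {
        m_items.reserve(maxItems);   // the only allocation, done up front
    }

    // Call at the start of each frame. clear() drops the elements
    // but keeps the reserved capacity, so nothing is freed here.
    void BeginFrame()
    {
        m_items.clear();
    }

    void Push(int value)
    {
        // Staying within the reservation means push_back never reallocates.
        assert(m_items.size() < m_items.capacity());
        m_items.push_back(value);
    }

    std::size_t Size() const { return m_items.size(); }
    std::size_t Capacity() const { return m_items.capacity(); }
    const int* Data() const { return m_items.data(); }

private:
    std::vector<int> m_items;
};
```

The idea is to construct one of these at startup, call BeginFrame() at the top of the frame loop, and Push() as needed. As long as the reservation is never exceeded, the frame loop itself should not touch the heap.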

A draft that may have preceded my new blog!

It’s been ages since I wrote anything here. Mostly this is because I was shipping an actual game and not working on hobby stuff. Also because I’m lazy but let’s just stick with the first one. But I’ve returned to the hobby coding fray with a renewed interest in generic engine tasks, so I should probably document some of that.

Some of my recent tasks have been to refactor my renderer framework to increase flexibility and reuse, to add a system for end-user-supported localization, to improve and extend the capabilities of my content packaging methods to clearly delineate assets which belong to the engine versus those that belong to the app, and to automatically maintain version numbers for the engine and for apps. Some of these are still works-in-progress and likely not worth discussing yet. Others are larger changes than I want to get into right now.