Posted: 2016-02-25 00:00:00

Parallax Occlusion Mapping ~ First Attempt

Parallax occlusion mapping (POM) is a form of image-space (fragment-shader-based)
displacement. It approximates the look of a bumpy surface by using
a displacement map to offset the position at which each pixel on the surface is textured.

Advantages:
- Does not require complex tessellation processes
- Relatively straightforward to achieve
- Can work on any angled surface
- Effect only runs for visible pixels

Disadvantages:
- High-end effect; still too slow to run in real time (alongside other effects) on slower computers.
- Requires a higher number of samples for more significant displacement.
- Texture distortion when up close (caused by stretching textures)
- Difficult to align along hard edges

Overview

This is my first attempt at creating the effect.
Whilst I am proud to say that I came up with the method myself based on my
understanding of vector projection and spaces, my current implementation
is flawed, and I will try to explain why. Nonetheless, this is
an exciting start to an effect which has huge potential to
look fantastic in a real game scenario!

My Method

My intuition on how to achieve the effect broke down into a number of
simple steps. First, create a new vector basis consisting of
the surface normal (which points outwards from the surface) and two
orthogonal vectors (which I will call the tangent and bitangent) that lie
within the surface of any given polygon.
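
The basis construction above can be sketched roughly as follows. This is a minimal illustration in Python rather than shader code. The function names (normalize, cross, dot, build_basis) and the helper-axis trick are assumptions made for the example, not taken from the implementation discussed in this post. It also picks an arbitrary tangent direction, whereas a real renderer would normally align the tangent with the texture's u direction.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def build_basis(normal):
    """Build a (tangent, bitangent, normal) orthonormal basis.

    Assumption: any pair of vectors lying in the surface will do, so we
    cross the normal with a helper axis that is not parallel to it.
    """
    n = normalize(normal)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    tangent = normalize(cross(helper, n))   # lies in the surface
    bitangent = cross(n, tangent)           # also in the surface, at 90 degrees to tangent
    return tangent, bitangent, n

# Example: a polygon facing along +z.
tangent, bitangent, normal = build_basis((0.0, 0.0, 2.0))
```

All three vectors come out unit length and mutually perpendicular, which is what the later dot-product transform relies on.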

Once I had this, I then needed to calculate the vector from the eye
to the initial position on the surface of the rendered pixel.

Now comes the fun part. Given a displacement map, I assumed that this map would represent
a virtual offset. I set it up in such a way that black meant 0 displacement
and white meant 100% displacement, so that I could visualise
any flat surface with its virtual offset applied.

Given this representation, it is time to transform our coordinates into
this newly defined local space, which I will call tangent space.

Now that I've talked about the conversion process, an interesting thing to note is that
we ALREADY have a coordinate in this exact system: the texture coordinate
of the pixel at that given point. On its own, however, this is not very useful.
What we want is a means of adding a third dimension to this coordinate system, i.e. a depth.
We can achieve this by defining our transformation mapping as follows:

world space x -> texture_u (tangent space x)
world space y -> texture_v (tangent space y)
world space z -> flipped_screen_space_normal (So that it points down into the surface.) (tangent space z)
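
One common way to express a mapping like this is to project the world-space vector onto each basis vector with a dot product. The sketch below assumes that approach. The name to_tangent_space and the example floor polygon are hypothetical, chosen only to show the sign convention in which positive z points down into the surface.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def to_tangent_space(v, tangent, bitangent, normal):
    """Project a world-space vector onto the tangent-space axes.

    x follows the tangent (texture u), y follows the bitangent (texture v),
    and z uses the flipped normal so that positive z points into the surface.
    """
    return (dot(v, tangent), dot(v, bitangent), -dot(v, normal))

# Example: a floor polygon whose normal points straight up (+y).
tangent = (1.0, 0.0, 0.0)
bitangent = (0.0, 0.0, 1.0)
normal = (0.0, 1.0, 0.0)

eye = (0.0, 2.0, -2.0)            # hypothetical camera position
surface_point = (0.0, 0.0, 0.0)   # world-space position of the pixel

# View vector from the eye to the surface point, then into tangent space.
view = tuple(p - e for p, e in zip(surface_point, eye))
view_ts = to_tangent_space(view, tangent, bitangent, normal)
```

Because the camera looks down at the floor, the tangent-space z component is positive, meaning the ray travels into the surface.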

Now, given a point on the surface, we need to be able to step along the inside
of the surface until our ray intersects the heightmap offset at a given point.
The following process sums up what we now want to achieve:

1) Transform View Vector (Camera to world space pixel coordinate) into tangent space (as defined above).

2) Begin raymarching with a step distance equal to a pre-defined step-size
constant multiplied by our normalized tangent space view vector.

3) Sample depth at each step, compare depth of heightmap to tangent space z.
If sample depth is greater than tangent depth, we have intersected.

4) Return tangent space (x, y).
This will represent our new texture coordinate.

Here, tangent depth is the tangent-space z coordinate (0 = on the surface of the triangle, 1 = maximum offset, defined by some scaling constant).

This will give us our desired point of intersection, and thus a new texture coordinate for that pixel.
Repeating this for every pixel gives us parallax mapping. It is worth noting, however, that at this stage we
have only considered a simple implementation, which will have a number of artifacts and ultimately require
a large number of samples to achieve good results.
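
The fixed-step march can be sketched in a few lines. This is a CPU-side Python illustration under stated assumptions, not the shader from this post. It assumes the map is read as a depth (0 = on the surface, 1 = deepest), that a hit occurs once the ray's depth reaches the map's depth, and that texture and depth units share one scale. The names ray_march and depth_at are hypothetical.

```python
import math

def ray_march(uv, view_ts, depth_at, step_size=0.05, max_steps=20):
    """March a ray through tangent space in fixed steps.

    uv       -- starting texture coordinate (u, v) on the polygon surface
    view_ts  -- tangent-space view vector (z positive = into the surface)
    depth_at -- function (u, v) -> depth in [0, 1]; stands in for a texture fetch
    Returns the (u, v) of the first step at or below the depth map, or the
    last position reached if no hit was found within max_steps.
    """
    length = math.sqrt(sum(c * c for c in view_ts))
    step = tuple(c / length * step_size for c in view_ts)

    u, v, z = uv[0], uv[1], 0.0
    for _ in range(max_steps):
        u += step[0]
        v += step[1]
        z += step[2]
        if z >= depth_at(u, v):   # assumed hit test: ray is at or below the map
            break
    return (u, v)

# Example: a flat "floor" at depth 0.5, viewed at 45 degrees.
flat = lambda u, v: 0.5
new_uv = ray_march((0.0, 0.0), (1.0, 0.0, 1.0), flat, step_size=0.1)
```

The returned coordinate overshoots the true intersection by up to one step, which is the banding that the refinement in the next section is meant to reduce.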

Improving the result

We can improve the result and optimise the process by introducing a simple binary search in our ray march.
As we are stepping in fixed intervals defined by a constant, the precision of our offset is defined by this step size, and the maximum height
we can achieve is defined by the total number of samples we are willing to perform.

For production code, we would ideally want <= 30 samples in total per pixel. We can drastically improve our results by allocating 20 samples
towards our initial reconstruction, and then a further 5 samples for a binary search.

The way this works is that, given a smaller number of samples, we know the true
corrected coordinate lies somewhere within the last band we stepped through, whose
length is equal to the maximum possible offset divided by the number of samples.
We also know that, as a vector, the band is equal to this length multiplied by our
tangent-space view vector.
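
To put rough numbers on this, the sketch below works through the band length for the 20 + 5 sample split mentioned above. It assumes a maximum offset of 1.0 and that each binary search iteration exactly halves the band. The function names are made up for the example.

```python
def band_length(max_offset, linear_samples):
    """Length of one fixed step: the uncertainty left after the linear march."""
    return max_offset / linear_samples

def refined_band_length(max_offset, linear_samples, binary_iterations):
    """Uncertainty after halving the final band once per binary iteration."""
    return band_length(max_offset, linear_samples) / (2 ** binary_iterations)

def equivalent_linear_samples(linear_samples, binary_iterations):
    """Number of fixed steps needed to match the refined precision."""
    return linear_samples * (2 ** binary_iterations)

coarse = band_length(1.0, 20)                 # one twentieth of the maximum offset
fine = refined_band_length(1.0, 20, 5)        # the same band halved five times
equivalent = equivalent_linear_samples(20, 5) # 20 * 32 fixed steps
```

Under these assumptions, 25 samples reach the precision that a purely linear march would need 640 samples to match.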

In order to perform a binary search to improve our result, we do the following:

1) Find the tangent space coordinate at the start of that step.

2) Find the tangent space coordinate at the end of the step we are in.

3) Find the midpoint of that band.

4) Perform the same test (i.e. compare the sample depth to the heightmap depth).
If the sample depth is greater than the heightmap depth, we are still "inside" the
texture, so move our result to the midpoint, and repeat the binary search using
the FIRST HALF of the step as the input.
Otherwise, we are outside of the texture and have gone too far back,
so keep our result where it is, and repeat the process on the SECOND HALF
of the step.

Note how each time we are dividing the size of the step by 2, meaning we
gain a drastic increase in precision within a small number of iterations.
After a few samples the result converges: merely 3 samples will put our
coordinate almost dead-on where we want it to be.
In practice, this will give the effect of smoothing any banding between results.
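
The refinement steps above can be sketched as follows. As before, this is a Python illustration under assumptions rather than the shader itself: the map is treated as a depth (0 = on the surface, 1 = deepest), "inside" means the ray's depth has reached the map's depth, and the names refine_hit and depth_at are hypothetical.

```python
def refine_hit(start, end, depth_at, iterations=5):
    """Binary-search the final step of a ray march.

    start -- (u, v, z) at the beginning of the step, assumed outside the surface
    end   -- (u, v, z) at the end of the step, assumed inside the surface
    depth_at -- function (u, v) -> depth in [0, 1]
    Returns the refined (u, v) texture coordinate.
    """
    outside, inside = start, end
    for _ in range(iterations):
        mid = tuple((a + b) * 0.5 for a, b in zip(outside, inside))
        if mid[2] >= depth_at(mid[0], mid[1]):
            inside = mid    # midpoint is inside: the hit lies in the first half
        else:
            outside = mid   # midpoint is outside: the hit lies in the second half
    return (inside[0], inside[1])

# Example: a flat floor at depth 0.5, with the true hit at u = 0.5.
flat = lambda u, v: 0.5
u, v = refine_hit((0.42, 0.0, 0.42), (0.52, 0.0, 0.52), flat, iterations=5)
```

Each iteration costs one extra heightmap sample and halves the remaining error, so five iterations shrink the original band by a factor of 32.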

Conclusion of my implementation

Whilst I believe that, in theory, my implementation is good, the actual result
is flawed. Whilst the effect works, it does not work correctly along the edges
of geometry. I believe this is because my conversion is incorrect in places, which leads to
the heightmap being pulled out of the texture rather than being pushed in.
The symptom is that when you walk near the edge of geometry, the top-height pixels warp,
whereas the pixels with the greatest depth should be the ones that warp, and the ones
which closely align with their original position should stay where they are.

My implementation also suffers from another subtle warping artifact in which pixels closer
to the camera appear to curve/ripple. This could be a fault of my initial world-space reconstruction,
or could perhaps be an issue associated with using a linear depth buffer rather than an exponential one
for that conversion, as the near pixels are proportionally less precise.

Ultimately, more work will need to be done to improve the result. Performance-wise, the effect
already appears to run reasonably well!