What's remarkable about the above scene is how it is produced. Unlike most renditions of three-dimensional scenes, this one is raytraced. Even better, the raytracing is done in real time, on your machine, in your browser. This is made possible by the extreme parallelism of modern GPUs, and the ease with which they can be programmed using shaders.

A benefit of raytraced rendering is a lack of polygons. The limiting factor in displaying a smooth curve like a sphere's edge is the pixels themselves, not the geometry.

For comparison, this is what passed for a round pipe in 1996.

This way of representing objects has a certain nostalgic charm to it. No doubt. Constructing a virtual world from scratch out of triangles alone is an art form.

But for a more direct sampling of the underlying shapes, raytracing bypasses the smoke and mirrors of rasterization, z-buffering, and the rest. Zoom in as much as you like; the structure doesn't fail. The math behind the shape and its lighting is evaluated at every pixel, every frame.

No surprise then that to understand how raytracing works, you need a little math.

From this point of view you can see how the raytracing works. It all boils down to answering one question: how far would you have to travel in a specific direction to hit the ball?

The rest is details. If you can answer that question, you can render a sphere perfectly. No vertices, no corners. We can answer that question, and the secret sauce is the old high school quadratic formula. Remember that?

$x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}$

solves the equation

$ax^2 + bx + c = 0$
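
Translated directly into shader code, that formula is only a few lines. A minimal sketch (this `solveQuadratic` helper is hypothetical -- the article's shader will bake the same math into a sphere-specific function further down):

    //returns the smaller root of ax^2 + bx + c = 0,
    //or -1.0 if there are no real roots
    float solveQuadratic( float a, float b, float c )
    {
        float h = b*b - 4.0*a*c;            //the discriminant
        if ( h < 0.0 ) return -1.0;         //no real solutions
        return (-b - sqrt(h)) / (2.0*a);
    }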

But doing this in three dimensions requires vectors. These vectors:

$$ \definecolor{pspice}{RGB}{253, 151,31} \definecolor{cyan}{RGB}{0,185,215} \definecolor{ache}{RGB}{255,61,115} \definecolor{groan}{RGB}{166,226,46} \definecolor{flour}{RGB}{253,245,169} \definecolor{bleen}{RGB}{130,205,185} $$

$\color{cyan}\vec d$ is the direction vector of our line segment of length $\color{ache} t$ extending from the camera $\color{pspice} \vec o$ through a pixel at location $\color{flour} \vec p$ to hit a point $\vec x$ on the sphere centered at $\color{groan}\vec s$ with radius $\color{bleen} r$. Note that of those variables, in practice we only need to calculate $\color{ache} t$ and $\vec x$. The rest are already known, set by the user or by the program's environment. And $\color{cyan}\vec d$ is just $\color{flour}\vec p - \color{pspice}\vec o$ normalized.
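
In shader code, that normalization is a single line. Assuming `camPos` holds $\color{pspice}\vec o$ and `surfPos` holds $\color{flour}\vec p$ (the names the listings further down use):

    vec3 d = normalize( surfPos - camPos );    //direction from the camera through the pixel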

The sphere is defined as all the points $\vec x$ that are a distance $\color{bleen}r$ away from $\color{groan}\vec s$. Mathematically, that's given by:

$|| \vec x - \color{groan}\vec s \color{white}||^2 = \color{bleen}r\color{white}^2$

The equivalent definition for the ray is:

$\vec x = \color{pspice}\vec o \color{white} + \color{cyan}\vec d \color{ache}t$

$\vec x$ is what needs to be solved for, but not directly. First we'll find the length $\color{ache} t$.

Combining the equations for the sphere and the line by substituting $\vec x$, we get

$|| \color{pspice}{\vec o} \color{white} + \color{cyan}\vec d \color{ache}t \color{white} - \color{groan}\vec s \color{white} ||^2 = \color{bleen}r\color{white}^2$

Expanding the square:

$(\color{pspice}\vec o \color{white}- \color{groan}\vec s \color{white})^2 + 2( \color{pspice}{\vec o} \color{white}- \color{groan}\vec s\color{white}) \color{cyan}\vec d \color{ache}t \color{white}+ \color{cyan}\vec d\color{white}^2 \color{ache}t\color{white}^2 - \color{bleen}r\color{white}^2 = 0$

Which is the quadratic equation we know how to solve for $\color{ache} t$, perhaps more easily seen if we rearrange things a little.

$\color{cyan}\vec d\color{white}^2 \color{ache}t\color{white}^2 + 2(\color{pspice}\vec o\color{white}-\color{groan}\vec s\color{white})\color{cyan}\vec d \color{ache}t\color{white} + (\color{pspice}{\vec o}\color{white} - \color{groan}\vec s\color{white})^2 - \color{bleen}r\color{white}^2 = 0$

In terms of the quadratic formula's $a$, $b$, and $c$:

$\begin{align} a & = \color{cyan}\vec d \color{white}^2 = 1 \\b & = 2(\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d \\ c & = (\color{pspice}\vec o\color{white} - \color{groan}\vec s\color{white})^2 - \color{bleen}r\color{white}^2 \end{align} $

Because $\color{cyan}\vec d$ is a unit vector of length 1, $\color{cyan}\vec d\color{white}^2= 1$ and can be ignored. Thus the solution for $\color{ache}t$ is:

$\color{ache}t\color{white} = \frac{-2(\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d \color{white} \pm \sqrt{(2(\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d\color{white})^2 - 4 ((\color{pspice}\vec o\color{white} - \color{groan}\vec s\color{white})^2 - \color{bleen}r\color{white}^2) }}{2}$

$\color{ache}t\color{white} = -(\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d\color{white} \pm \sqrt{ ((\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d\color{white})^2 -(\color{pspice}\vec o\color{white} - \color{groan}\vec s\color{white})^2 + \color{bleen}r\color{white}^2 }$

A line may intersect a sphere once, twice, or not at all. The radicand $((\color{pspice}\vec o\color{white} - \color{groan} \vec s\color{white}) \color{cyan}\vec d\color{white})^2 -(\color{pspice}\vec o\color{white} - \color{groan}\vec s\color{white})^2 + \color{bleen}r\color{white}^2$ dictates which is the case. If the radicand is negative there are no intersections, if it is positive there are two, and if it is exactly zero there is one.
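
As a quick sanity check with made-up numbers: place the camera at the origin, put a sphere of radius $\color{bleen}r\color{white} = 1$ at $\color{groan}\vec s\color{white} = (0, 0, 5)$, and aim straight at it with $\color{cyan}\vec d\color{white} = (0, 0, 1)$. Then $(\color{pspice}\vec o\color{white} - \color{groan}\vec s\color{white})\color{cyan}\vec d\color{white} = -5$, the radicand is $25 - 25 + 1 = 1$, and $\color{ache}t\color{white} = 5 \pm 1$: the ray enters the sphere at a distance of 4 and exits at 6, exactly where you'd expect.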

That may seem like a lot of math to answer a single question, but really most of it is just a justification for how little code we can get away with:



    //a function to determine where a ray from point p with direction
    //d would intersect a sphere defined by a vec4
    //(sph.xyz is the center, sph.w is the radius)

    float intersectSphere( in vec3 p, in vec3 d, in vec4 sph )
    {
        vec3 ps = p - sph.xyz;                  //p - s
        float b = 2.0 * dot(ps, d);
        float c = dot(ps, ps) - sph.w*sph.w;
        float h = b*b - 4.0*c;                  //h is the radicand to test
        if ( h < 0.0 ) return -1.0;             //-1 represents no intersection
        float t = (-b - sqrt(h))/2.0;           //the smaller root: the nearer intersection
        return t;
    }

With just that we can determine whether the cast ray intersects the sphere by inspecting whether $\color{ache} t$ is positive or negative. Negative indicates the ray missed the sphere and shot out into infinity. That's enough to determine whether a pixel should or shouldn't be trying to draw a sphere.

But when $\color{ache} t$ is positive, not only do we know that the cast ray intersects the sphere, we can also figure out where it does by plugging $\color{ache} t$ back into

$\vec x = \color{pspice}\vec o \color{white} + \color{cyan}\vec d \color{ache}t$
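
In the shader that's again a single line, reusing the names from the listing below:

    vec3 x = camPos + t*d;      //the point where the ray meets the sphere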

With the location known, we can do a lot. We could determine which way the surface of the sphere is facing at that location, and how that compares to some pretend lights.

Traditional rendering with polygons figures out which way each triangle is facing, and stores those directions for looking up later. The triangles facing directly at the light vector get lit up brightly, and those that face away not so much. The result depicted is called flat shading -- you can easily make out the edges of the polygons that make up the surface. There are tricks for fudging and smoothing these directions so that it looks better, but with raytracing we won't need any of those.

Because the ray could hit the sphere anywhere, we need a way of getting the exact direction anywhere on the surface.

These "surface directions" are often called normal vectors, $\vec{n}$, we can just subtract the center of the sphere, $\vec{s}$ from the point on the surface $\vec{x}$. And because we're only attempting to represent direction we normalize it -- make sure that the length of $\vec{n}$ is really exactly perfectly 1 for math happys.


    vec3 n = normalize( x - sph.xyz );      //surface point minus sphere center s

Mathematically, we need some kind of function or trick that communicates how closely one thing points at another.


    float litness = theOppositeOf howAlikeAreTheseDirections( surfaceDirection, lightDirection );

Thankfully, the mathists of beforetimes defined just such a function, the dot product, and then the nice graphics hardware specification specifiers decided to include the dot product as a built-in function in shader code. It's like they knew what we would want before we did!


    float litness = - dot( n, l );

Above we already normalized the normal vector, $\vec n$, so it definitely has a length of 1. If we're careful to define the light as a vector of length 1 too, then the dot product will max out at 1 when they point in exactly the same direction, and bottom out at -1 when they point in completely opposite directions, with 0 representing perpendicular vectors. We negate that value because we're more interested in when the two vectors are pointing at each other.
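
A quick check with concrete vectors (these particular values are just for illustration):

    //a normal pointing straight at an oncoming light gives full brightness
    float litness = -dot( vec3(0.0, 0.0, 1.0), vec3(0.0, 0.0, -1.0) );    // = 1.0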


    void main(){
        //a sphere that swings back and forth, with a radius of 10
        vec4 sph = vec4(
            30.0*sin(sin(time)),
            0.0,
            60.0-30.0*cos(sin(time)), 10.0);

        vec3 light = normalize( vec3(1.0, 0.0, 0.0) );
        vec3 d = normalize( surfPos - camPos );
        float t = intersectSphere( camPos, d, sph );
        vec4 col = vec4( 0.1, 0.1, 0.1, 1.0 );          //background color

        if ( t > 0.0 ) {
            vec3 x = camPos + t*d;                      //the intersection point
            vec3 nor = normalize( x - sph.xyz );        //the surface normal
            float dif = clamp( -dot(nor, light), 0.0, 1.0 );
            col = vec4(0.3, 0.6, 0.7, 1.0)*dif;         //sphere color scaled by litness
        }
        gl_FragColor = col;
    }

For comparison, here is the flat version from before lighting was added:

    void main(){
        //the same swinging sphere
        vec4 sph = vec4(
            30.0*sin(sin(time)),
            0.0,
            60.0-30.0*cos(sin(time)), 10.0);

        vec3 d = normalize( surfPos - camPos );
        float t = intersectSphere( camPos, d, sph );
        vec4 col = vec4( 0.1, 0.1, 0.1, 1.0 );      //background color

        if ( t > 0.0 ) {
            col = vec4(0.3, 0.6, 0.7, 1.0);         //flat sphere color, no lighting
        }
        gl_FragColor = col;
    }
    

Spheres are a good start, but what about something more complicated?

The above hyperboloid and ellipsoid combo is rendered at a lower resolution because the raytracing techniques required are a bit heavier. Where we could find the distance to a point on the sphere with a single function call, these more interesting shapes are defined a bit differently. Instead, each ray fumbles along, like a person stumbling in the dark, until it bumps into one of the shapes -- a technique usually called raymarching. Each ray could take hundreds of steps, where the sphere only ever took one! Plus, because of the way shaders work, even if a ray only needs to bumble a few steps before hitting a shape, we don't really see any time saved. The GPU runs a lot of these ray programs in lockstep, so a ray that finishes early still waits for the slowest ray in its group: it has to assume the worst.
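
A minimal sketch of that stumbling loop, assuming a hypothetical `distanceToScene` function that returns how far a given point is from the nearest shape (none of these names come from the article's actual shader):

    //march along the ray from origin o in direction d, stepping
    //forward by the distance to the nearest surface each time
    float raymarch( in vec3 o, in vec3 d )
    {
        float t = 0.0;
        for ( int i = 0; i < 200; i++ )        //worst case: hundreds of steps
        {
            float h = distanceToScene( o + d*t );
            if ( h < 0.001 ) return t;         //close enough: call it a hit
            t += h;                            //it's safe to step this far
            if ( t > 1000.0 ) break;           //wandered off into the void
        }
        return -1.0;                           //no intersection
    }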