Quake 3 fast inverse square root algorithm

2023-01-12

This blog is inspired by this video made by youtuber Nemean explaining the fast inv-sqrt algorithm written for the Quake 3 game.

I want to put what he have said into note form, while simultaneously exploring similar solutions of this algorithm in the Rust language.

This will be in three major sections:
Context | Breakdown | Solution in rust

The original C program

Here is the original c code for the algorithm:


float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the fuck? 
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//	y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}

Context

Section: who cares

purpose

This algorithm is designed to normalise light ray vectors to 1. This is needed to calculate the light physics in game.

Light ray vectors is calculated through:
$https://latex.codecogs.com/gif.image?\inline \LARGE \dpi{110}\bg{black}\sqrt{x^2 + y^2 + z^2}$

And this we want to normalize it down to the range between 0 to 1, we divided each component by the length of the vector, which leads to things like this (we are using x axis as an example):
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}\frac{x}{\sqrt{x^2 + y^2 + z^2}}$

which leads to
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}x \times \frac{1}{\sqrt{x^2 + y^2 + z^2}}$

which leads to the need to calculating the inverse square roots.

Section: Variable decleartion and IEEE 754


float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

The first two variables are bascially set out to represent the number in binary scientific notation form. The variable i is a 32 bit long number. The variable x2 and y are two float(decimal) number

As we know, the binary scientific notation is in the form of:


0 00000000 00000000000000000000000

which consistes of:

1 bit for positive or negative sign (- or +)
8 bit for the exponents (the x in $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}y^x$ )
23 bits called the mantisa is for the two decimal place for the main number (the .xx in 1.xx )

And since we are normalizing a number to the range of 0 to 1, the following is given:

the number will always be positive
the number will be in normalized form

Section: bits and numbers

This section basically pointed out that the bit representaiton of the exponent and mantisa can be roughly equal to its logged version with additional constance. Meaning: the bit representation of x can mostly equal log(x).

The bit representation of mantisa(M) and the exponents(E) can be represented as $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}2^{23}*E + M$ , which in number form is $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}(1+\frac{M}{2^{23}}) * 2^{E-127}$ .

For us to get the log version, we first wrap it with log_2:
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}log_{2}((1+\frac{M}{2^{23}}) * 2^{E-127})$
We then simplify to get
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}log_{2}(1+\frac{M}{2^{23}}) + E - 127$ .

At this point, it is a bit stuck, since we have to figure out how to get rid of log_2 in the front. However, it was found that given a small enough x, $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}log_{2}(1 + x) \approx x$ .
It was also found that if the equation has the smallest error if we add x by the constant of 0.0430.

Hence, we get the following formula when we apply the findings:
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}\frac{M}{2^{23}} + constant+ E - 127$

and after re-arranging we get:
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}\frac{1}{2^{23}}(M + 2^{23} * E) + [constant](0.0430) - 127$

where we can find what we've discovered earlier:
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}2^{23}*E + M$

Breakdown

Section: evil bit hack


i  = * ( long * ) &y;

We want to enable bit manipulation on float. This is necessary when we want to utilize bit manipulation to realize number divisions.

However, we don't want to convert the variable itself from float to long, because that will cause information loss. We want bit representation of the float to become a long itself.

So instead, this line of code converts the address of the float into an address of a long.

Break down

The &y is the address of the float y. The ( long * ) converts the address of float y into the address of a long. So the C language will think it's now a long number livving in that address instead of a float. The * then reads what in the address.

Section WTF


i  = 0x5f3759df - ( i >> 1 );

This line of code attempts to use bit shifting to manipulate the exponent part of a number in order to achive inverse square root.

So bascially, given $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}x^{1}$ .

If you half the exponent (which can easily be done using big shifting), you get $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}x^{\frac{1}{2}}$ .

And if you negate the exponent, you get $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}x^{-\frac{1}{2}}$ , which is what we want .

However, since directly calculating the inverse square root is what we want to avoid through this algorithm, we need to work it out some other way. This is where what we talked about in the previous section comes in. If we get y in the IEEE 754 form, the bit representation of y can be interpreted as the log of itself.

Hence instead of calculating $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}\frac{1}{\sqrt{y}}$ , we calculate $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}log(\frac{1}{\sqrt{y}})$ , which can be turned to $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}-\frac{1}{2}log(y)$ .

note..

there is still a division in the equation, which can be dealt with using bit shift - ( i >> 1 )

Since we want to find out the inverse square root of a number, we can than draw an equation like this:
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}log(\Gamma ) = -\frac{1}{2}log(y)$

We can then use the previously discussed equation $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}\frac{1}{2^{23}}(M + 2^{23} * E) + [constant] - 127$ and estabalish a new equation:
$https://latex.codecogs.com/png.image?\LARGE \dpi{110}\bg{black}log(\frac{1}{2^{23}}(M_{\Gamma} + 2^{23} * E_{\Gamma}) + [constant] - 127) = -\frac{1}{2}log(\frac{1}{2^{23}}(M_{y} + 2^{23} * E_{y}) + [constant] - 127)$ .

After solving for gamma, we get this:

$https://latex.codecogs.com/png.image?\LARGE \dpi{110}\bg{black}M_{\Gamma} + 2^{23} * E_{\Gamma} = \frac{3}{2}2^{23}(127 - constant) - \frac{1}{2}(M_{y} + 2^{23} * E_{y})$
where $https://latex.codecogs.com/png.image?\LARGE \dpi{110}\bg{black} \frac{3}{2}2^{23}(127 - constant)$ = 0x5f3759df


y  = * ( float * ) &i;

Finally we performs the evil bit hack again but in reverse.

Section: Newton iteration


y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//	y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

This line of code utilized the Newton iteration, we need this becuase the result from the reverse evil bit was still not accurate enough for this calculation.

Newton iteration

Newton iteration finds root of a function. This mean that is finds the x in f(x) = 0 through many iteration of approximation.

Some of the basics of newton iteration involves guessing a new value of x through the f(x) and its derivative. This leads to the following equation
$https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}x_{new} = x - \frac{f(x)}{f^{'}(x)}$

The quake 3 algorithm only did the iteration 1 time because the result already makes the error to be within 1%. The code essentially translates to $https://latex.codecogs.com/gif.image?\inline \huge \dpi{110}\bg{black}f(y) = \frac{1}{y^{2}} - x$ . And when y is the root of the function, y is also the inverse square root of x.

Some solution of this algorithm in rust

Here's a version of this code written in rust found on this stackoverflow post. (rust playground)


#![allow(unused)]
fn main() {
fn inv_sqrt(x: f32) -> f32 {
    let i = x.to_bits();
    let i = 0x5f3759df - (i >> 1);
    let y = f32::from_bits(i);

    y * (1.5 - 0.5 * x * y * y)
}
}