Faster than a speeding square root

In the data from the Census Bureau, the location of the census blocks are given by latitude and longitude. Instead of degrees, minutes, and seconds, the latitude and longitude are given in millionths of a degree. Fortunately, these are conveniently expressed as 32-bit integers. Finding the geographical distance between two points usually means employing the Pythagorean theorem by the computer. But if fast is desired and sloppy is acceptable, the square root can be avoided. The first method is to simply add a horizontal “X” distance to the vertical “Y” distance to yield an approximate hypotenuse. This first method is three times faster than the Pythagorean theorem, but the error is up to 42%. One reason for the great speed is no need for a squared term, which could overrange the integer. Another reason is that square root requires floating point values.

 For Sandboxwalls in the early stages, all of the census blocks near the district center will be included in the district anyway. The order in which they are included is unimportant. In spite of the sloppiness, the result is the same. In the late stages of the Sandboxwalls algorithm, such sloppiness would cause some census blocks to wind up with a different district. Although the Pythagorean theorem is needed in the end, the Sandboxwalls algorithm can be speeded up by not using square root in the beginning. The trade-off is how much speed is achieved for an allowed sloppiness.

 A second method works as follows: take the shorter of “X” and “Y”, and cut it in half. Add that half to the longer of “X” and “Y” to yield an approximate hypotenuse. Although taking a little more computer time, the error is 0% to 12%. Although taking half is a division operation, an optimized compiler will accomplish this quickly with a shift register operation. Keeping in mind the need to use shift register operations, method three subtracts an additional term. Besides the half of the shorter of “X” and “Y”, a 16th of the shorter is also subtracted. Method three is still almost twice as fast as using the Pythagorean theorem, and the error is 0% to 9.2%.

Method 4 subtracts an eighth instead of a 16th. The speed is the same but the error becomes from -2.8% to about 7%. Method Five is an attempt to get the best of both Method three and method Four. Mostly method five uses method three, but switches to method Four if the angle is large. The error for method Five is 0% to 7%, but the price for the greater accuracy is much less speed.

 I wondered if the quick values obtained by one of these methods could be used as the initial value in the Newton Raphson algorithm for square roots. Unfortunately, this algorithm consumes more computer time using Fortran than the standard optimized Pythagorean theorem.

Here is a chart of my results:

Method 1: long + short integers                                78 time units      error:  0%  to  42%
Method 2: long + half short integers                         94 time units      error:  0%  to  12%
Method 3: long + half short – sixteenth short         156 time units      error:  0%  to  9.2%
Method 4: long + half short – eighth short              156 time units      error: -3%  to   7%
Method 5: long + switch sixteenth,eighth                203 time units      error:  0%  to   7%
Pythagorean: single precision square root               281 time units      assumed accurate

This is the fortran code cut from a test program using method 5:

       i=kx*(longmz-longsf);  j=ky*(latmz-latsf) !-Martinez
       if(i<j) then                             
          if(8*i<7*j) then   !–used 7/8 ratio since 8=2^3
          if(7*i>8*j) then

Another time consuming process is sorting the census blocks. I’ll consider that in my next blog.


Post a Comment

Required fields are marked *


%d bloggers like this: