Proper/best type for storing latitude and longitude
In a system level programming language like C, C++ or D, what is the best type/encoding for storing latitude and longitude?
The options I see are:
- IEEE-754 FP as degrees or radians
- degrees or radians stored as a fixed point value in a 32 or 64 bit int
- mapping of an integer range to the degree range: deg = (360/2^32)*val
- degrees, minutes, seconds and fractional seconds stored as bit fields in an int
- a struct of some kind.
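For illustration, the integer-range option (deg = (360/2^32)*val) could be sketched as below. The function names are hypothetical, and the sketch assumes the input is an ordinary angle in degrees (no NaN, no huge magnitudes).

```c
#include <stdint.h>

/* Hypothetical helpers: the full 2^32 range of a uint32_t covers 360 degrees. */

/* Degrees -> 32-bit angle units, rounded to nearest. */
uint32_t deg_to_u32(double deg)
{
    double scaled = deg * (4294967296.0 / 360.0);   /* 2^32 / 360 units per degree */
    int64_t units = (int64_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
    /* Conversion to uint32_t reduces modulo 2^32, so -90 and 270 degrees
       map to the same value and 360 wraps back to 0. */
    return (uint32_t)units;
}

/* 32-bit angle units -> degrees in [0, 360). */
double u32_to_deg(uint32_t val)
{
    return val * (360.0 / 4294967296.0);
}
```

A nice side effect of this layout is that unsigned overflow does the angle wrap-around for free; the cost is a conversion step before any trigonometry.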
The easy solution (FP) has the major downside that its resolution is highly non-uniform (somewhere in England it can measure in microns; over in Japan, it can't). It also has all the usual issues of FP comparison and whatnot. The other options require extra effort in different parts of the data's life cycle (generation, presentation, calculations, etc.).
One interesting option is a floating precision type where, as the latitude increases, the latitude gets more bits and the longitude gets fewer (since the meridians get closer together towards the poles).
Related questions that don't quite cover this:
- What is the ideal data type to use when storing latitude / longitudes in a MySQL
- Working with latitude/longitude values in Java
BTW: 32 bits gives you an E/W resolution at the equator of about 0.3 in. This is close to the scale that high grade GPS setups can work at (IIRC they can get down to about 0.5 in in some modes).
OTOH if the 32 bits are uniformly distributed over the earth's surface, you can index squares of about 344 m on a side; 5 bytes give 21 m, 6 bytes give 1.3 m and 8 bytes give 5 mm.
I don't have a specific use in mind right now but have worked with this kind of thing before and expect to again, at some point.
Solution 1:
The easiest way is just to store it as a float/double in degrees. Positive for N and E, negative for S and W. Just remember that minutes and seconds are out of 60 (so 31°45'N is 31.75). It's easy to understand what the values are by looking at them and, where necessary, conversion to radians is trivial.
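As a minimal sketch of that convention (the function name and the hemisphere letter parameter are just illustrative choices):

```c
/* Convert degrees/minutes/seconds plus a hemisphere letter to signed
   decimal degrees: positive for N and E, negative for S and W. */
double dms_to_decimal(int degrees, int minutes, double seconds, char hemisphere)
{
    double value = degrees + minutes / 60.0 + seconds / 3600.0;
    return (hemisphere == 'S' || hemisphere == 'W') ? -value : value;
}
```

For example, dms_to_decimal(31, 45, 0.0, 'N') gives 31.75, and the same call with 'S' gives -31.75.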
Calculations on latitudes and longitudes, such as the Great Circle distance between two coordinates, rely heavily on trigonometric functions, which typically operate on doubles. Any other format is going to rely on another implementation of sine, cosine, atan2 and square root, at a minimum. Arbitrary precision numbers (e.g. BigDecimal in Java) won't work for this. Something like the int where 2^32 is spread uniformly is going to have similar issues.
The point of uniformity has come up in several comments. On this I shall simply note that the Earth, with respect to longitude, isn't uniform. One arc-second of longitude at the Arctic Circle is a shorter distance than at the Equator. Double precision floats give sub-millimetre precision anywhere on Earth. Is this not sufficient? If not, why not?
It'd also be worth noting what you want to do with that information as the types of calculations you require will have an impact on what storage format you use.
Solution 2:
Longitudes and latitudes are not generally known to any greater precision than a 32-bit float. So if you're concerned about storage space, you can use floats. But in general it's more convenient to work with numbers as doubles.
Radians are more convenient for theoretical math. (For example, the derivative of sine is cosine only when you use radians.) But degrees are typically more familiar and easier for people to interpret, so you might want to stick with degrees.
Solution 3:
A decimal representation with a precision of 8 decimal places should be more than enough, according to this Wikipedia article on Decimal Degrees.
0 decimal places, 1.0 = 111 km
...
7 decimal places, 0.0000001 = 1.11 cm
8 decimal places, 0.00000001 = 1.11 mm
Solution 4:
http://www.esri.com/news/arcuser/0400/wdside.html
At the equator, an arc-second of longitude approximately equals an arc-second of latitude, which is 1/60th of a nautical mile (or 101.27 feet or 30.87 meters).
A 32-bit float contains 23 explicit mantissa bits.
180 * 3600 requires log2(648000) ~= 19.31 bits of data. Note that the sign bit is stored separately, so we need to account for only 180 degrees.
If you normalize the value 648000 to some power of 2, the following calculation applies.
After subtracting the log2(648000) bits from 23, about 3.69 bits remain for sub-second data.
That is 2^3.69 ~= 12.95 parts per arc-second.
Therefore a float data type can have 30.87 / 12.95 ~= 2.38 meters precision at the equator.
The above calculation holds if you normalize the 180 degrees value to some power of 2. Otherwise, assuming that sub-degree precision is stored after the decimal point, the floating point representation will physically use all 8 bits for the degrees part. That leaves 15 bits for the sub-degree precision. Then 15 - log2(3600) leaves about 3.19 bits for sub-second data, i.e. 2^3.19 ~= 9.10 parts per arc-second, or 30.87 / 9.10 ~= 3.39 meters precision at the equator. That is about a meter worse than normalization would have provided.
Solution 5:
Might the problems you mentioned with floating point values become an issue? If the answer is no, I'd suggest just using the radians value in double precision - you'll need it if you'll be doing trigonometric calculations anyway.
If there might be an issue with precision loss when using doubles, or you won't be doing trigonometry, I'd suggest your solution of mapping to an integer range. This will give you the best resolution, can easily be converted to whatever display format your locale will be using and, after choosing an appropriate 0-meridian, can be converted to floating point values of high precision.
PS: I've always wondered why there seems to be no one who uses geocentric spherical coordinates. They should be reasonably close to the geographical coordinates, and won't require all this fancy math on spheroids to do computations. For fun, I wanted to convert Gauss-Krüger coordinates (which are in use by the German Katasteramt, the land registry office) to GPS coordinates. Let me tell you, that was ugly: one uses the Bessel ellipsoid, the other WGS84, and the Gauss-Krüger mapping itself is pretty crazy on its own...