Geostatistical interpolation (Kriging) can be useful in a great number of applications where high fidelity models are required for mapping spatial effects and making predictions based on observations. It is widely utilized in the domain of spatial analysis and computer experiments and heavily used by the US Air Force and GIS services.
The following images by Yang, et. al. provides a visual comparision of twelve different methods – including Kriging Interpolation – against a source geospatial image:
While it has been used largely in the geophysical sciences, Kriging shines in many situations where variation must be considered explicitly and an exact interpolation is needed (i.e., one whose output matches the input data points at points where data is available). For instance, in problems mapping the coverage of wireless transmitters, or potential interference between wireless networks, Kriging has been shown to provide better results than other approaches to coverage mapping. In this way, the problem involves mapping a random field in space using some number of measurements, and making accurate predictions about the coverage in locations that haven’t been measured. This approach is applicable to a great number of geospatial mapping and prediction problems.
This power, however comes at a cost: Kriging is very computationally costly. Each prediction requires solving a quadratic equation in, at worst, O(N^4) time. The cost of making such predictions means that the high-resolution methods of modeling appear inaccessible to big data problems. However, in practice we have found that by parallelizing the most computationally costly parts of Kriging can result in a substantial speedup. The parallel implementation can produce results in reasonable time, on commodity hardware, even for real world data sets. For massive data sets, we can combine low-resolution interpolation (such as nearest neighbor) with high-resolution mapping to produce incremental results in real time.
Kriging Interpolation requires three steps:
- Estimation of the semivariogram
- Semivariogram model fitting
- Expression of the N_TILE x N_TILE solution of the kriging weights and creation of the raster points
The Semivariogram model fitting is easily accomplished with the MAGMA sgetrf() method, which exhibits excellent scaling on GPUs (or Intel Xeon Phi) as shown below:
The expression of the N_TILE x N_TILE solution of the kriging weights and creation of the raster points can be expressed as a simple set of nested loops that calls kriging_results().
for(int i=0; i < TILE_SIZE; i++)
for(int j=0; j < TILE_SIZE; j++)
Since each call to kriging_results() is independent, it is trivial to spread these calls across however many GPUs happen to be in a system. As a result, we can parallelize the most computationally expensive step to achieve a 55x speedup on an NVIDIA K20C per GPU over a quad-core Xeon processor:
- 4-core Xeon: 825 ms
- Nvidia K20c: 15 ms
Utilizing two GPUs results in a 110x speedup, 4 GPUs delivers a 220x speedup and so on. Of course, this same mapping will work with multiple Intel Xeon Phi and other devices.
This work was performed in collaboration with MapLarge and Calib Phillips at smallwhitecube.com.