Mean array scaling in Python
scikit-image’s resize_local_mean function is an easy way to rescale a 2D numpy array, when you want the value of the new pixels to represent the mean of the pixels they were covering:
import numpy as np
import skimage.transform
# Small test array.
x = np.linspace(0, 3, 7)
y = np.linspace(0, 1, 5)
xx, yy = np.meshgrid(x, y)
a = xx + yy
new_shape = (3, 4)
# Do the resize.
a_resized = skimage.transform.resize_local_mean(a, new_shape, grid_mode=True)
However, resize_local_mean
runs very slowly on large arrays.
OpenCV has a similar algorithm that scales much better to large arrays! The function just becomes a little more complicated:
- The
dsize
argument is reversed compared to the numpy ecosystem. cv2.resize
expects a normalised array with values between 0 and 1.- The normalisation also means checking for an ill-conditioned array.
import cv2
def opencv_resize_local_mean(a: np.ndarray, new_shape: tuple[int, int]) -> np.ndarray:
a_min = np.min(a)
a_max = np.max(a)
# Early exit to avoid divide by zero.
if a_min == a_max:
return np.zeros(new_shape) + a_min
# Rescale the array to [0, 1].
a_norm = (a - a_min) / (a_max - a_min)
a_norm = a_norm.astype(np.float32)
# Do the resizing.
new_shape_r = [new_shape[1], new_shape[0]]
res_norm = cv2.resize(a_norm, new_shape_r, interpolation=cv2.INTER_AREA)
# Unscale the array.
res = res_norm * (a_max - a_min) + a_min
assert np.array_equal(res.shape, new_shape)
For a test resizing of a 10,000 px DEM to 3,600 px, the average error between the two solutions is small. The max error was about 2 mm.
diff = res_skimage - res_cv2
np.mean(diff): 1.108501658070558e-05
np.median(diff): -7.279990974495831e-07
np.max(diff): 0.01960058534207576
np.min(diff): -1.3228989928393275e-05
The runtime is a lot better also for large arrays:
5,000px -> 3600px
resize_local_mean 3.482 [s]
cv2_area 0.229 [s]
10,000px -> 3,600px
m_resize_local_mean 11.541 [s]
cv2_area 0.743 [s]
20,000px -> 3,600px
m_resize_local_mean 33.303 [s]
cv2_area 2.822 [s]
And finally, the CPU usage is a lot lower also. While the latency of a single invocation is about 10x faster with OpenCV, if you’re saturating your CPU running these calculations (as I was!) I found my throughput increased by about 50x.
DEMs as a service
GPXZ’s raster API lets you specify your desired bounds and resolution: we do the resizing and warping for you!
If you need high-quality elevation DEMs, get in touch at [email protected]