Extracting DEMs from a WCS server
WCS (Web Coverage Service) is an OGC standard interface for retrieving geospatial raster data. It integrates well with specialised GIS tools like QGIS, but there’s not a lot of information out there about using WCS with custom analytical workflows. So here’s an overview of how you would extract raster DEM data from a WCS server in Python.
It’s worth noting that while WCS is a standard, it mostly specifies how client-server communication is done rather than the schema of the data, which means that each WCS server may need slightly different handling.
Summary
The basic WCS flow is
- Connect to the WCS server
- Determine the “coverage ID” representing the desired data
- Determine the extent and projection of the data
- Split the extent into tiles
- Query each tile
Steps
You’ll need OWSLib installed to handle the WCS protocol
pip install OWSLib
as well as some standard Python tools, which may already be installed
pip install numpy requests
Start off by connecting to the WCS server, which is defined by a URL
from owslib.wcs import WebCoverageService
WCS_URL = "https://example.com/ows/wcs"
wcs = WebCoverageService(WCS_URL)
WCS has a concept of “coverages”, which are analogous to a layer or a dataset. List all the server’s coverages with
wcs.contents.keys()
['example_dem']
and select the one you want
coverage_id = "example_dem"
coverage = wcs.contents[coverage_id]
The next piece of information we need is the spatial extent of the data.
coverage.boundingBox
{
'nativeSrs': 'http://www.opengis.net/def/crs/EPSG/0/25833',
'bbox': (228152.0, 5690412.0, 493382.0, 5939023.0)
}
We get a bounding box in (west, south, east, north) order, and a CRS definition. Note how the CRS for this server is specified using a URL rather than an EPSG code or WKT string. If you’re providing your own bounding box (to extract only a subset of the full coverage) make sure it is in the correct CRS.
I find it easy to mess up these unnamed bounding box tuples, so I always unpack them into named variables straight away:
bounds_left, bounds_bottom, bounds_right, bounds_top = coverage.boundingBox['bbox']
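If your area of interest starts out in lat/lon, one way to move it into the native CRS is with pyproj. This is an optional sketch (pyproj isn’t used anywhere else in this post), the corner coordinates are purely hypothetical, and note that reprojecting just the two corners only approximates the true reprojected box.
from pyproj import Transformer
# Reproject a lon/lat area of interest into the coverage's native CRS (EPSG:25833 here).
to_native = Transformer.from_crs("EPSG:4326", "EPSG:25833", always_xy=True)
aoi_left, aoi_bottom = to_native.transform(11.0, 51.3)  # hypothetical south-west corner (lon, lat)
aoi_right, aoi_top = to_native.transform(12.0, 51.8)    # hypothetical north-east corner (lon, lat)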
For small domains, you can likely query the entire dataset in one go. But for large domains such as this, you’ll need to split your query into tiles. Choosing a suitable tile size takes a bit of guess and check: some servers impose per-query size limits, while others may simply time out before any limit is reached.
We’ll also define a buffer so that our tiles overlap. This helps avoid holes, and preserves accuracy when making interpolated raster reads near the edges.
The tile sizes should be in the units of the nativeSrs of the coverage, and should be integer multiples of the dataset’s resolution.
tile_size = 10000
tile_buffer = 10
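A rough way to do that guess and check is to estimate how many pixels one buffered tile request will contain and compare it with whatever limit you suspect the server has. Both numbers below are placeholders, not values from any real server:
# Rough sanity check on the tile size; substitute your coverage's real resolution
# and the server's actual per-request limit (both assumed here).
resolution = 1.0          # metres per pixel (assumed)
max_pixels = 100_000_000  # assumed per-request pixel limit
pixels_per_tile = ((tile_size + 2 * tile_buffer) / resolution) ** 2
assert pixels_per_tile < max_pixels, "tile_size is probably too large for a single request"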
Now some numpy magic to define the lower-left corners of our tile grid. The start of each range is snapped down to a multiple of tile_size, and the stop is the raw right/top bound so that np.arange (which excludes its stop value) still emits a final tile covering the edge of the extent.
import numpy as np
lefts = np.arange(
    bounds_left - bounds_left % tile_size, bounds_right, tile_size
)
bottoms = np.arange(
    bounds_bottom - bounds_bottom % tile_size, bounds_top, tile_size
)
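Before firing off hundreds of requests, it’s worth a quick check of how many tiles this grid produces:
# Number of GetCoverage requests the loop below will make.
print(f"{len(lefts)} x {len(bottoms)} = {len(lefts) * len(bottoms)} tiles")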
We can do a nested loop over these corners and request each raster.
Some things to note:
- While the query bounds are specified with the “Lat” and “Long” arguments, the query is actually done in the CRS given by subsettingcrs, even if that CRS isn’t lat/lon. In other words, think of Lat as y and Lon as x.
- To avoid loss of accuracy via repeated reprojection and interpolation, do all the bounds calculation and subsetting in the nativeSrs. You can always postprocess after exporting the data.
- The file is named from the tile left/bottom coordinates for tidiness, though the actual extent of the tile will be buffered slightly larger.
- The code streams the response to disk to avoid storing the potentially large file in memory.
from pathlib import Path
import itertools
import urllib.parse
import shutil
import requests
BASE_FOLDER = Path("~/wcs-extract-tifs/").expanduser()
BASE_FOLDER.mkdir(parents=True, exist_ok=True)
for left, bottom in itertools.product(lefts, bottoms):
# Where to save the tile.
tif_filename = f"dem_{left}_{bottom}.tif"
tif_path = BASE_FOLDER / tif_filename
# Tile query definition.
params = [
("service", "WCS"),
("version", wcs.version),
("request", "GetCoverage"),
("coverageId", coverage_id),
("format", "image/tiff"),
("subset", f"Lat({bottom-tile_buffer},{bottom+tile_size+tile_buffer})"),
("subset", f"Long({left-tile_buffer},{left+tile_size+tile_buffer})"),
("subsettingcrs", "http://www.opengis.net/def/crs/EPSG/0/25833"),
]
tile_url = WCS_URL + "?" + urllib.parse.urlencode(params)
# Make query.
with requests.get(tile_url, stream=True, timeout=60, verify=False) as r:
r.raise_for_status()
with open(tif_path, "wb") as f:
shutil.copyfileobj(r.raw, f)
At this point your BASE_FOLDER should be full of .tif DEMs!
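rasterio isn’t needed for the download itself, but if you happen to have it installed, a quick spot check of one tile is a good way to confirm the CRS and extent came through as expected (a sketch, not part of the extraction proper):
import rasterio
# Inspect the first downloaded tile.
sample_path = next(BASE_FOLDER.glob("dem_*.tif"))
with rasterio.open(sample_path) as src:
    print(src.crs)     # should correspond to the coverage's nativeSrs
    print(src.bounds)  # should be the buffered tile extent
    print(src.read(1)) # the elevation values themselves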
As a final optimisation, I find WCS servers typically do a bad job of compressing tifs. I’d recommend repacking each one into a compressed Cloud-Optimized GeoTIFF (COG) with gdal_translate, as sketched below.
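Here’s a minimal sketch of that repacking step, assuming gdal_translate from GDAL 3.1 or newer (which ships the COG driver) is available on your PATH; DEFLATE compression is just a sensible default, not the only option:
import subprocess
# Repack each raw tile into a compressed Cloud-Optimized GeoTIFF next to it.
for tif_path in sorted(BASE_FOLDER.glob("dem_*.tif")):
    cog_path = tif_path.with_name(tif_path.stem + "_cog.tif")
    subprocess.run(
        ["gdal_translate", "-of", "COG", "-co", "COMPRESS=DEFLATE",
         str(tif_path), str(cog_path)],
        check=True,
    )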
Elevation data
Stuck downloading DEMs from a WCS server? We might already have the data you’re looking for in our dataset, check out our coverage map.
We also do consulting work around DEMs and geospatial data management. If that sounds like something you could use, reach out at [email protected]!