Extracting DEMs from a WCS server

WCS is an interface for retrieving geospatial data. It integrates well with specialised GIS tools like QGIS, but there’s not a lot of information out there about using WCS with custom analytical workflows. So here’s an overview of how you would extract raster DEM data from a WCS server in Python.

It’s worth noting that while WCS is a standard, it mostly specifies how client-server communication is done rather that the schema of the data. Which means that each WCS server may need slightly different handling.

Summary

The basic WCS flow is

  • Connect to the WCS server
  • Determine the “coverage ID” representing the desired data
  • Determine the extent and projection of the data
  • Split the extent into tiles
  • Query each tile

Steps

You’ll need OWSLib installed to handle the WCS protocol

pip install OWSLib

as well as some standard python tools which may be installed already

pip install numpy requests

Start off by connecting to the WCS server, which is defined by a URL

from owslib.wcs import WebCoverageService

WCS_URL = "https://example./com/ows/wcs"
wcs = WebCoverageService(WCS_URL)

WCS has a concept of “coverages”, which are analogous to a layer or a dataset. List all the server’s coverages with

wcs.contents.keys()
['example_dem']

and select the one you want

coverage_id = "example_dem"
coverage = wcs.contents[coverage_id]

The next piece of information we need is the spatial extent of the data.

coverage.boundingBox
{
    'nativeSrs': 'http://www.opengis.net/def/crs/EPSG/0/25833',
    'bbox': (228152.0, 5690412.0, 493382.0, 5939023.0)
}

We get a bounding box (in west, south, east, north) order, and a CRS definition. Note how the CRS for this server is specified using a URL rather than an EPSG code or WKT string. If you’re providing your own bounding box (to extract only a subset of the full coverage) make sure it is in the correct CRS.

I find it easy to mess up these unnamed bounding box tuples, so always unpack them instead,

bounds_left, bounds_bottom, bounds_right, bounds_top = coverage.boundingBox['bbox']

For small domains, you can likely query the entire dataset in one go. But for large domains such as this, you’ll need to split up your query into tiles. Choosing the correct tile size will take a bit of guess and check: some servers impose per-query size limits, while others may simply timeout before any limit is reached.

We’ll also define a buffer so that our tiles overlap. This helps avoiding holes, and preserves accuracy when making interpolated raster reads near the edges.

The tile sizes should be in the units of the nativeSrs of the coverage, and should be integer multiples of the dataset’s resolution.

tile_size = 10000
tile_buffer = 10

Now some numpy magic to define the lower-left corners of our tile grids.

import numpy as np

lefts = np.arange(
    bounds_left - bounds_left % tile_size, bounds_right - bounds_right % tile_size, tile_size
)

bottoms = np.arange(
    bounds_bottom - bounds_bottom % tile_size, bounds_top - bounds_top % tile_size, tile_size
)

We can do a nested loop over these corners and request each raster.

Some things to note:

  • While the query bounds are specified with the “Lat” and “Long” arguments, the query is actually being done in the CRS specified in subsettingcrs, even if that CRS isn’t latlon. In other words, think of Lat as y and Lon as x.
  • To avoid loss of accuracy via repeated reprojection and interpolation, do all the bounds calculation and subsetting the nativeSrs. You can always postprocess after exporting the data.
  • The file is named from the tile left/bottom coordinates for tidiness, though the actual extent of the tile will be buffered slightly larger.
  • The code streams the response to disk to avoid storing the potentially large file in memory.
from pathlib import Path
import itertools
import urllib.parse
import shutil

import requests


BASE_FOLDER = "~/wcs-extract-tifs/"
BASE_FOLDER.mkdir()


for left, bottom in itertools.product(lefts, bottoms):
    # Where to save the tile.
    tif_filename = f"dem_{left}_{bottom}.tif"
    tif_path = BASE_FOLDER / tif_filename

    # Tile query definition.
    params = [
        ("service", "WCS"),
        ("version", wcs.version),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", "image/tiff"),
        ("subset", f"Lat({bottom-tile_buffer},{bottom+tile_size+tile_buffer})"),
        ("subset", f"Long({left-tile_buffer},{left+tile_size+tile_buffer})"),
        ("subsettingcrs", "http://www.opengis.net/def/crs/EPSG/0/25833"),
    ]
    tile_url = WCS_URL + "?" + urllib.parse.urlencode(params)

    # Make query.
    with requests.get(tile_url, stream=True, timeout=60, verify=False) as r:
        r.raise_for_status()
        with open(tif_path, "wb") as f:
            shutil.copyfileobj(r.raw, f)

At this point your BASE_FOLDER should be full of .tif DEMs!

As a final optimisation, I find WCS servers typically do a bad job of compressing tifs. I’d recommend repacking each tif into a compressed COG geotiff with gdal_translate.

Elevation data

Stuck downloading DEMs from a WCS server? We might already have the data you’re looking for in our dataset, check out our coverage map.

We also do consulting work around DEMs and geospatial data management. If that sounds like something you could use, reach out at [email protected]!