Example using cupy

Main question: Is there any example of geoviews QuadMesh usage with cupy arrays?

More specific:
I started building a diagnostic tool for aircraft / satellite data from imaging instruments. A given flight's data fits in an array of roughly 200k-400k by ~200 pixels. We already have a process to regrid it accurately, but I want to build a quick-look option before the data gets to that stage, where it doesn't need to be represented with full accuracy.
To take a specific example, I'm looking at a 177203x165 dataset with three irregular 2D fields: latitude, longitude, and surface_altitude. I divided it into 43 chunks of (4121, 165) with dask, working in a notebook with a live kernel:

import numpy as np
import dask.array as da
import holoviews as hv
from holoviews.operation.datashader import rasterize
import geoviews as gv
from holoviews.element.tiles import EsriImagery
hv.extension('bokeh')

# assume code here reading longitude, latitude, and surface altitude into lon, lat, and surfalt

dalon = da.from_array(lon, chunks=(4121, 165))
dalat = da.from_array(lat, chunks=(4121, 165))
dasurfalt = da.from_array(surfalt, chunks=(4121, 165))
dasurfalt_quad = gv.project(gv.QuadMesh((dalon, dalat, dasurfalt)))
surfalt_plot = rasterize(dasurfalt_quad, width=450, height=450, precompute=True).opts(
    width=450, height=450, cmap="viridis", colorbar=True, title="Surface Altitude (km)")
EsriImagery() * surfalt_plot
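As a quick sanity check on the numbers above, 177203 along-track rows split into 4121-row chunks gives exactly the 43 chunks mentioned:

```python
# Chunk arithmetic for the dataset described above: 177203 along-track rows
# split into 4121-row dask chunks.
n_rows, chunk_rows = 177203, 4121
n_chunks, remainder = divmod(n_rows, chunk_rows)
print(n_chunks, remainder)  # -> 43 0 (the split is exact)
```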

The plot shows up, but after panning/zooming the raster can take half a minute to update. I wanted to see if there are ways to make the updates more fluid. This is my first dive into cupy; do I just need to declare cupy arrays instead of dask arrays to work on GPUs? I use rasterize because I'd rather keep the bokeh colorbar.

Maybe this?

Also maybe relevant?
https://projectpythia.org/kerchunk-cookbook/notebooks/case_studies/Streaming_Visualizations_with_Hvplot_Datashader.html


Thanks, it looks like a similar issue in the second link, where there's a few-second delay for the data to update after zooming in/out; in my case the delay approaches 30-60 seconds. They show that opening the datasets with the zarr engine seems to speed things up, but I would like to explicitly use cupy arrays.

So I tried using cupy arrays, following my first example:

import cupy as cp
cpsurfalt = cp.asarray(surfalt)
cplon = cp.asarray(lon)
cplat = cp.asarray(lat)
cpsurfalt_quad = gv.project(gv.QuadMesh((cplon,cplat,cpsurfalt)))

But I get this error:

DataError: None of the available storage backends were able to support the supplied data format.

According to the Datashader performance docs (Performance — Datashader v0.16.0), datashader should work with "Xarray+cupy" for quadmeshes. What is wrong in my usage?
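If "Xarray+cupy" is taken literally, one guess is that the arrays need to be wrapped in an xarray.DataArray before being handed to gv.QuadMesh, rather than passed as a bare tuple. A minimal sketch of that wrapping, shown with numpy stand-ins so it runs without a GPU; whether cupy-backed arrays are accepted in the same positions is the untested assumption here:

```python
import numpy as np
import xarray as xr

# Stand-in 2D curvilinear coordinates (numpy here; the assumption is that
# cupy arrays could be substituted in the same positions).
ny, nx = 10, 165
lon2d = np.linspace(-120.0, -119.0, nx) * np.ones((ny, 1))
lat2d = np.linspace(30.0, 40.0, ny)[:, None] * np.ones((1, nx))
surfalt2d = np.hypot(lon2d + 119.5, lat2d - 35.0)

qdata = xr.DataArray(
    surfalt2d, name="surfalt", dims=["y", "x"],
    coords={"lon": (["y", "x"], lon2d), "lat": (["y", "x"], lat2d)},
)
# The QuadMesh would then be built from the DataArray, e.g.
# gv.QuadMesh(qdata, kdims=["lon", "lat"], vdims=["surfalt"])
print(qdata.dims, qdata.shape)  # ('y', 'x') (10, 165)
```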

When I try using the datashader API directly, I get a different error:

import datashader as ds
from datashader import transfer_functions as tf
import cupy_xarray
import xarray as xr

cpxr = xr.DataArray(cpsurfalt, name='surfalt', dims=['y', 'x'],
                    coords={'lon': (['y', 'x'], cplon),
                            'lat': (['y', 'x'], cplat)})
canvas = ds.Canvas()
tf.shade(canvas.quadmesh(cpxr, x='lon', y='lat'))

I then get:

TypeError: Implicit conversion to a NumPy array is not allowed. Please use .get() to construct a NumPy array explicitly.

The code works when I replace the cupy arrays with numpy arrays. I have an available CUDA device and have tested simple cupy operations, which are indeed faster than the corresponding numpy operations, so I suspect I'm not using the datashader/holoviews APIs correctly?
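To confirm it really is the implicit conversion that trips things up, a small helper that forces the explicit device-to-host copy the TypeError asks for lets the same pipeline accept either array type. This is only a check that the rest of the code is fine, not a GPU speedup, since it moves the data back to the host. Written with a fallback so it runs CPU-only when cupy isn't installed:

```python
import numpy as np

def to_host(arr):
    # cupy ndarrays expose .get() for an explicit device-to-host copy,
    # which is exactly what the TypeError above asks for; numpy arrays
    # have no .get() and pass through unchanged.
    return arr.get() if hasattr(arr, "get") else np.asarray(arr)

a = np.arange(6, dtype=float).reshape(2, 3)
print(to_host(a).shape)  # (2, 3)
```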

In case my example isn't convenient since it's not complete code, here I reproduce the error using the irregular meshes example from https://holoviews.org/reference/elements/bokeh/QuadMesh.html:

import numpy as np
import holoviews as hv
import geoviews as gv
import cupy as cp
hv.extension('bokeh')

n = 20
coords = np.linspace(-1.5, 1.5, n)
X, Y = np.meshgrid(coords, coords)
Qx = np.cos(Y) - np.cos(X)
Qy = np.sin(Y) + np.sin(X)
Z = np.sqrt(X**2 + Y**2)

qmesh = gv.QuadMesh((Qx, Qy, Z))

However the following doesn’t work:

cp_qmesh = gv.QuadMesh((cp.asarray(Qx), cp.asarray(Qy), cp.asarray(Z)))

And produces this error:

DataError: None of the available storage backends were able to support the supplied data format.