Identify certain traces in a datashaded plot


I’m using Datashader to plot large numbers of RF measurement traces, which makes it very easy to spot outliers visually. Since each trace carries a unique identifier, I was wondering if there is an easy way to get the IDs of all traces in a certain pixel so that they could be selected for further processing.


That’s functionality I’ve wanted to have for a long time, but I haven’t come up with a good way to make it practical to implement either at the library level or for users. What I think you’d do as a user is to use HoloViews streams to return the x,y range of a box selection on the Datashader plot, and you’d then have to filter the original dataset for traces that overlap that region, which could get complicated in practice and will definitely be slow for lines or other non-point shapes.
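The filtering step described above could look roughly like this. This is only a sketch under assumptions: the traces are held in a plain dict mapping each id to its (x, y) samples, the box bounds have already been obtained from the selection, and the name `traces_in_box` is hypothetical rather than part of HoloViews or Datashader. It only tests the samples themselves, so a line segment that crosses the box between two samples would be missed.

```python
def traces_in_box(traces, x_range, y_range):
    """Return ids of traces with at least one sample inside the box."""
    x0, x1 = sorted(x_range)
    y0, y1 = sorted(y_range)
    hits = []
    for trace_id, samples in traces.items():
        # Linear scan over every sample: simple, but slow for many long traces.
        if any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in samples):
            hits.append(trace_id)
    return hits


# Made-up example data: trace "b" has a spike at x = 1.
traces = {
    "a": [(0.0, 1.0), (1.0, 1.1), (2.0, 0.9)],
    "b": [(0.0, 1.0), (1.0, 5.0), (2.0, 1.0)],
    "c": [(0.0, 0.8), (1.0, 1.0), (2.0, 1.2)],
}
selected = traces_in_box(traces, x_range=(0.5, 1.5), y_range=(4.0, 6.0))
```

Here the box is drawn around the spike, so only trace "b" is selected.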

Datashader itself doesn’t keep track of any metadata about what goes into each pixel, and in general it can’t do that, as arbitrarily many points might land in any particular pixel, requiring unbounded buffers per pixel. It could be feasible to add a Datashader reduction function that keeps track of the last N values of some other column, which would make it bounded. If that sounds useful, please file a feature request at, but it seems tricky to implement. Once it’s in Datashader, HoloViews could then make it available for hover or other uses.
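To illustrate what a bounded "last N values" buffer per pixel means, here is a standalone sketch in plain Python. It is not Datashader code and does not reflect how such a reduction would actually be implemented there; the function name `last_n_ids_per_pixel` and the point layout `(x, y, id)` are assumptions made for the example.

```python
from collections import defaultdict, deque


def last_n_ids_per_pixel(points, x_range, y_range, width, height, n=3):
    """Bin (x, y, id) points into a width x height grid, keeping at most n ids per pixel."""
    x0, x1 = x_range
    y0, y1 = y_range
    # deque(maxlen=n) silently drops the oldest entry once it is full,
    # so memory per pixel stays bounded no matter how many points arrive.
    buffers = defaultdict(lambda: deque(maxlen=n))
    for x, y, trace_id in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # skip points outside the plotted range
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        buffers[(row, col)].append(trace_id)
    return buffers


# Made-up example: three points share one pixel, but only the last two ids survive.
points = [(0.1, 0.1, "a"), (0.15, 0.12, "b"), (0.12, 0.18, "c"), (0.9, 0.9, "d")]
buffers = last_n_ids_per_pixel(points, (0.0, 1.0), (0.0, 1.0), width=4, height=4, n=2)
```

The trade-off is visible in the example: the buffer stays small, but ids that arrived earlier than the last N are lost, so a lookup on a crowded pixel is incomplete by design.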

In the meantime, if you look in, you can see one approach that’s already feasible. That plot overlays a decimated version of the dataset on top of the datashaded one, so that when you zoom in far enough the points of interest will (eventually) be hoverable, selectable, etc. That way you can pretend the data is all available for selection, implementing everything as if it were a small dataset, but you can still always see all the data. Maybe that’s good enough for now?

First of all, thanks for the extensive answer; I appreciate it. I will have a look at the examples.

HoloViews is currently not an option for us, because we’re using Datashader in MATLAB plots (it had to be integrated into the legacy workflow).
When playing around with Datashader yesterday, a few ideas came to mind:

  1. I was wondering if ds.count_cat could be exploited to look up the ids of all traces in a certain pixel, basically by doing a 1x1 pixel aggregation step on an id column in the input data.
  2. Using the histogram-equalized aggregated counts to detect outliers, assuming low-count pixels are due to outlier traces.
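
A rough way to try out the second idea without touching the plotting code is sketched below. It is a standalone illustration in plain Python and does not call Datashader: it bins the samples itself, uses a raw count threshold (the hypothetical `max_count` parameter) instead of histogram equalization, and assumes every sample lies inside the given ranges and every trace has at least one sample. The helper names are made up for the example.

```python
from collections import Counter


def pixel_of(x, y, x_range, y_range, width, height):
    """Map a data coordinate to a (row, col) pixel; assumes the point is in range."""
    x0, x1 = x_range
    y0, y1 = y_range
    col = min(int((x - x0) / (x1 - x0) * width), width - 1)
    row = min(int((y - y0) / (y1 - y0) * height), height - 1)
    return row, col


def outlier_scores(traces, x_range, y_range, width, height, max_count=1):
    """Fraction of each trace's samples that fall in sparsely populated pixels."""
    counts = Counter()
    pixels = {}
    for trace_id, samples in traces.items():
        pixels[trace_id] = [
            pixel_of(x, y, x_range, y_range, width, height) for x, y in samples
        ]
        counts.update(pixels[trace_id])  # per-pixel sample counts over all traces
    return {
        trace_id: sum(counts[p] <= max_count for p in pix) / len(pix)
        for trace_id, pix in pixels.items()
    }


# Made-up example: trace "b" leaves the crowd for one sample.
traces = {
    "a": [(0.5, 1.5), (1.5, 1.5), (2.5, 1.5)],
    "b": [(0.5, 1.5), (1.5, 5.5), (2.5, 1.5)],
    "c": [(0.5, 1.5), (1.5, 1.5), (2.5, 1.5)],
}
scores = outlier_scores(traces, (0.0, 3.0), (0.0, 6.0), width=3, height=6)
suspect = max(scores, key=scores.get)
```

Scoring each trace by the fraction of its samples in low-count pixels, rather than flagging any trace that touches one, is a deliberate choice: with noisy data, well-behaved traces will occasionally land in a sparse pixel too.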

Best Regards