Identify certain traces in a datashaded plot

p3trus · February 24, 2020, 2:04pm

Hi,

I’m using datashader to plot large numbers of RF measurement traces, which makes it very easy to spot outliers visually. Since each trace carries a unique identifier, I was wondering if there is an easy way to get the IDs of all traces in a certain pixel so that they can be selected for further processing.

Thanks

jbednar · February 24, 2020, 9:19pm

That’s functionality I’ve wanted to have for a long time, but I haven’t come up with a good way to make it practical to implement either at the library level or for users. What I think you’d do as a user is to use HoloViews streams to return the x,y range of a box selection on the Datashader plot, and you’d then have to filter the original dataset for traces that overlap that region, which could get complicated in practice and will definitely be slow for lines or other non-point shapes.
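To give a feel for the filtering step, here is a minimal sketch in plain NumPy. It assumes the traces live in a dict mapping each id to its `(x, y)` arrays, and that `x_range` and `y_range` are the bounds of the box selection, wherever they come from. The names `segments_hit_box` and `select_traces` are made up for illustration and are not part of Datashader or HoloViews. It clips every line segment against the box (Liang-Barsky style), so a trace counts as a hit even when none of its vertices lie inside the box.

```python
import numpy as np

def segments_hit_box(x, y, x_range, y_range):
    """Return True if any segment of the polyline (x, y) intersects the box."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0, y0 = x[:-1], y[:-1]            # segment start points
    dx, dy = np.diff(x), np.diff(y)    # segment direction vectors
    xmin, xmax = x_range
    ymin, ymax = y_range
    t0 = np.zeros_like(dx)             # parameter where the segment enters the box
    t1 = np.ones_like(dx)              # parameter where the segment leaves the box
    ok = np.ones(dx.shape, dtype=bool)
    # Clip against the left, right, bottom and top edges in turn.
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        parallel = p == 0
        # A segment parallel to an edge and outside of it can never hit the box.
        ok &= ~(parallel & (q < 0))
        r = np.divide(q, p, out=np.zeros_like(q), where=~parallel)
        t0 = np.where(p < 0, np.maximum(t0, r), t0)
        t1 = np.where(p > 0, np.minimum(t1, r), t1)
    return bool(np.any(ok & (t0 <= t1)))

def select_traces(traces, x_range, y_range):
    """Return the ids of all traces with at least one segment touching the box."""
    return [tid for tid, (x, y) in traces.items()
            if segments_hit_box(x, y, x_range, y_range)]

# Toy data: "a" crosses the box without a vertex inside it, "b" misses it,
# "c" lies entirely inside it.
traces = {
    "a": ([0, 10], [0, 10]),
    "b": ([0, 10], [0, 0]),
    "c": ([5, 5.5], [5, 5.5]),
}
selected = select_traces(traces, x_range=(4, 6), y_range=(4, 6))
```

This loops over every trace in Python, so it illustrates why the approach gets slow for large collections of lines; a bounding-box pre-filter per trace would likely be the first optimisation to try.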

Datashader itself doesn’t keep track of any metadata about what goes into each pixel, and in general it can’t do that, as arbitrarily many points might land in any particular pixel, requiring unbounded buffers per pixel. It could be feasible to add a Datashader reduction function that keeps track of the last N values of some other column, which would make it bounded. If that sounds useful, please file a feature request at https://github.com/holoviz/datashader/issues, but it seems tricky to implement. Once it’s in Datashader, HoloViews could then make it available for hover or other uses.
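Purely to illustrate what a bounded "last N values per pixel" aggregate could look like, here is a NumPy sketch for point data. This is not an existing Datashader reduction; the function name, the `-1` fill value, and the integer id column are all assumptions made for the example.

```python
import numpy as np

def last_n_ids_per_pixel(x, y, ids, x_range, y_range, width, height, n=3):
    """Keep the most recent n integer ids that landed in each pixel.

    Returns an array of shape (height, width, n); slot 0 holds the newest id
    and unused slots are filled with -1. Assumes finite x and y values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ids = np.asarray(ids)
    # Map data coordinates to integer pixel indices.
    col = np.floor((x - x_range[0]) / (x_range[1] - x_range[0]) * width).astype(int)
    row = np.floor((y - y_range[0]) / (y_range[1] - y_range[0]) * height).astype(int)
    inside = (col >= 0) & (col < width) & (row >= 0) & (row < height)

    buf = np.full((height, width, n), -1, dtype=np.int64)
    for r, c, i in zip(row[inside], col[inside], ids[inside]):
        # Shift the existing ids back by one slot; the oldest one falls off.
        buf[r, c] = np.roll(buf[r, c], 1)
        buf[r, c, 0] = i
    return buf

# Four points share pixel (0, 0); only the last three ids survive there.
buf = last_n_ids_per_pixel(
    x=[0.5, 0.5, 0.5, 0.5, 2.5], y=[0.5] * 5, ids=[10, 11, 12, 13, 20],
    x_range=(0, 4), y_range=(0, 4), width=4, height=4, n=3,
)
```

The memory cost is fixed at `height * width * n` ids no matter how many points land in a pixel, which is the property that makes the idea bounded.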

In the meantime, if you look in https://examples.pyviz.org/uk_researchers/uk_researchers.html, you can see one approach that’s already feasible. That plot overlays a decimated version of the dataset on top of the datashaded one, so that when you zoom in far enough the points of interest will (eventually) be hoverable, selectable, etc. That way you can pretend the data is all available for selection, implementing everything as if it were a small dataset, but you can still always see all the data. Maybe that’s good enough for now?

p3trus · February 25, 2020, 11:04am

First of all, thanks for the extensive answer; I appreciate it. I will have a look at the examples.

HoloViews is currently not an option for us, because we’re using datashader in MATLAB plots (it had to be integrated into the legacy workflow).
When playing around with datashader yesterday, a few ideas came to mind:

I was wondering if ds.count_cat could be exploited to look up the IDs of all traces in a certain pixel, basically by doing a 1x1 pixel aggregation step on an ID column in the input data.
Another idea is to use the histogram-equalized aggregated counts to detect outliers, assuming that low-count pixels are due to outlier traces.
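Both ideas can be tried outside of datashader. The sketch below is plain NumPy and only a stand-in: it bins the trace samples as points rather than rasterizing line segments, and it uses a raw count threshold instead of histogram equalization. The helper names and the `max_count` parameter are hypothetical.

```python
import numpy as np

def pixel_index(x, y, x_range, y_range, width, height):
    """Map data coordinates to (row, col) pixel indices plus an in-bounds mask."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    col = np.floor((x - x_range[0]) / (x_range[1] - x_range[0]) * width).astype(int)
    row = np.floor((y - y_range[0]) / (y_range[1] - y_range[0]) * height).astype(int)
    inside = (col >= 0) & (col < width) & (row >= 0) & (row < height)
    return row, col, inside

def ids_in_pixel(x, y, ids, x_range, y_range, width, height, row, col):
    """Idea 1: ids of all traces with at least one sample in pixel (row, col)."""
    r, c, inside = pixel_index(x, y, x_range, y_range, width, height)
    hit = inside & (r == row) & (c == col)
    return np.unique(np.asarray(ids)[hit])

def ids_in_sparse_pixels(x, y, ids, x_range, y_range, width, height, max_count=1):
    """Idea 2: ids of traces that have samples in pixels hit at most max_count times."""
    r, c, inside = pixel_index(x, y, x_range, y_range, width, height)
    flat = r[inside] * width + c[inside]           # one flat index per sample
    counts = np.bincount(flat, minlength=width * height)
    sparse = counts[flat] <= max_count             # samples sitting in rare pixels
    return np.unique(np.asarray(ids)[inside][sparse])

# Traces 0 and 1 coincide; trace 2 has one sample far away from the others.
x = [0.5, 1.5, 2.5] * 3
y = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 3.5, 0.5]
ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
grid = dict(x_range=(0, 4), y_range=(0, 4), width=4, height=4)

shared = ids_in_pixel(x, y, ids, row=0, col=1, **grid)
outliers = ids_in_sparse_pixels(x, y, ids, max_count=1, **grid)
```

A sensible `max_count` depends on the number of traces and the plot resolution, so it would need tuning against real data.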

Best Regards

jbednar · December 13, 2021, 8:00pm

A bit late, but check out the ship_traffic example for a way of doing this with single points. We plan to add such hit detection for traces as well, but that’s not straightforward, since a hit might not be near any vertex of that trace. In any case, I can’t help with the MATLAB case, but at least it’s a proof of concept!

geoviz · June 9, 2023, 2:24pm

The ship traffic graphics are really great examples!

jbednar · July 26, 2023, 11:53pm

We are also adding “instant inspection”, partially supported in HoloViews 1.17 (released today) but better supported in 1.18. Instant inspection lets you see which line is which just by hovering, by storing source info per pixel during aggregation. The deployed ship_traffic example uses the same technology, but we don’t have line examples yet.