Error creating figure with more Curves than Sets1to3 colors

CrashLandonB · September 30, 2022, 6:52pm

How can I create figures with more Curves than there are colors in the Sets1to3 color set?

The example below builds a generic data set from the arguments passed to create_generic_data(). Arguments of (3, 22, 10) create 3 figures with 22 Curves each and 10 points per Curve. This works because the 22 Curves don’t exceed the 22 colors available in ds.colors.Sets1to3. If you change 22 to any higher value, an error is raised.

I’ve tried using inferno instead of Sets1to3, but I get the same error. I’ve also tried using Sets1to3*10 to extend the color list, with the same result.

I imagine my options would be different if I weren’t planning to plot much more data (hundreds of Curves per plot) with datashader. A reasonable approximation of my big-picture task would be to create a dataset using create_generic_data(10, 300, 100000).

Any suggestions?

import pickle
import holoviews as hv
import panel as pn
from holoviews.operation.datashader import datashade, rasterize, dynspread
import datashader as ds

hv.extension('bokeh')

class MultiPlot:
    """Column of datashaded plots, one per top-level key of ``data``.

    Inputs:
        - data: dictionary with two levels of keys.  One plot is created for each first-level key; each
                second-level key maps to a time-indexed pandas Series drawn as one Curve within that plot.
        """

    def __init__(self, data, max_plots=6, plot_width=800, plot_height=150, cnorm="eq_hist", line_width=2,
                 pixel_ratio=2):

        # Create a dictionary of all Curves.  Top-level keys define number of subplots.  Secondary-level keys point to
        # each parameter to be plotted in each subplot.  Time indices are aligned to start time.
        curves_dict = {top_key: {sec_key: hv.Curve(((sec_value.index - sec_value.index[0]).total_seconds(),
                                                     sec_value.values), kdims=['Time'], vdims=['Value'])
                                 for sec_key, sec_value in top_value.items()
                                 }
                       for top_key, top_value in data.items()}

        # Create color key for all second-level keys in data dictionary, and color_points as invisible element to drive
        # legend
        sec_keys = sorted(set([sec_key for primary_key, primary_value in data.items()
                               for sec_key in primary_value.keys()]))
        color_key = zip(sec_keys, ds.colors.Sets1to3)
        color_points = hv.NdOverlay({k: hv.Points([(0,0)], label=str(k)).opts(color=v, size=0) for k, v in color_key})

        # Create list of overlay objects; one per top-level key, with each containing one Curve per secondary-key
        self.overlay = [(datashade(hv.NdOverlay(curves, kdims='k'), line_width=line_width, pixel_ratio=pixel_ratio,
                                   cnorm=cnorm, aggregator=ds.by('k', ds.count())
                                   )*color_points).opts(width=plot_width, tools=['hover']).relabel(top_key)
                        for top_key, curves in curves_dict.items()]
        # Create Layout to plot each Overlay figure in a single column
        self.layout = hv.Layout(self.overlay).cols(1)

    def show(self):
        pn.serve(self.layout)


def create_generic_data(level1_parm_count, level2_parm_count, parm_sample_count):
    """Generates generic data dictionary"""
    import pandas as pd
    l1_parms = [f'L1P_{str(i)}' for i in range(level1_parm_count)]
    l2_parms = [f'L2P_{str(i)}' for i in range(level2_parm_count)]
    return {l1p: {l2p: pd.Series([j**int(l2p[-1]) for j in range(parm_sample_count)],
                                 index = pd.to_timedelta(range(parm_sample_count), unit='s'))
                  for l2p in l2_parms} for l1p in l1_parms}



if __name__ == '__main__':
    
    # # Pull data from file
    # with open('data_cache.pkl', 'rb') as data_cache:
    #     data = pickle.load(data_cache)

    # Optional creation of generic data set for troubleshooting. Args are (# of top keys, # of sec. keys,
    # # of points per Curve)
    data = create_generic_data(3, 22, 10)

    myplot = MultiPlot(data)
    myplot.show()

jbednar · October 10, 2022, 5:57pm

Humans aren’t great at distinguishing between lots of similar colors, but if you do need lots of distinct colors you can use colorcet’s various 256-color Glasbey color sets (see the Categorical page of the colorcet v3.0.1 docs), and you can even get 512 colors if you concatenate glasbey_light and glasbey_dark, or glasbey_cool and glasbey_warm. If even that’s not enough, you can generate arbitrarily many colors by following the same techniques we used to generate those color keys in the first place; see the relevant PRs on colorcet. Beyond a few dozen colors, though, I think it’s better to color by some meaningful category or quantity rather than just hoping for distinct colors (as in the ship_traffic demo), or even just to use datashader to show the entire distribution using a colormap by pixel density rather than a color key by identity, then use hover or other mechanisms to show identity.
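
As a rough sketch of the first suggestion: the helper below (a hypothetical name, build_color_key, not part of any library) maps each key to one color and fails loudly when the palette is too short, instead of letting zip() silently truncate. The four-color palette is a stand-in so the snippet runs without extra packages; the assumption is that in practice you would pass a longer list such as colorcet's glasbey_light + glasbey_dark.

```python
def build_color_key(keys, palette):
    """Map each unique key to one palette color, in sorted key order."""
    unique_keys = sorted(set(keys))
    if len(unique_keys) > len(palette):
        # Fail early with a clear message instead of letting zip() truncate.
        raise ValueError(
            f"{len(unique_keys)} keys but only {len(palette)} colors in palette")
    return dict(zip(unique_keys, palette))


# Stand-in palette for illustration only. With colorcet installed, one could
# instead try `cc.glasbey_light + cc.glasbey_dark` for 512 colors (assumption).
demo_palette = ['#d60000', '#8c3bff', '#018700', '#00acc6']
color_key = build_color_key(['L2P_2', 'L2P_0', 'L2P_1'], demo_palette)
print(color_key)
```

Note that the code in the original post builds color_key only for the legend points and never hands it to datashade, which may be why extending the list there had no effect. Passing the resulting dict as color_key= to datashade is probably the missing step, though I have not tested that against this exact example.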

CrashLandonB · October 10, 2022, 9:16pm

Thanks for the input. I concur that the more colors used, the less effective they will be. And categorization-by-color may be a feature at some point. But my intent is to create a plotting method that is hopefully not going to be frequently used for more than, say, 20 signals, but I want it to be robust enough to handle a few hundred, if I can.

And I do like the approach of colormapping by pixel density (with hover for identity). I was thinking about providing that as a user-specified option. I’m not sure how to enable hover for curve identity, with datashader, so I need to research that.

jbednar · October 10, 2022, 9:33pm

Direct support for curve identity hover is not yet implemented but is scheduled to be worked on soon; see holoviz/datashader issue #1126, “Directly support HoloViews-style inspect operations”. Indirect support was implemented in holoviz/holoviews pull request #5240, “Add inspect_curve operation” by philippjfr, but that still isn’t merged. In the meantime, I think the only practical way is to use inspect_points as in the ship_traffic demo, possibly after overlaying points on top of your curves (making them hoverable only at vertex locations). Hopefully we can implement the approach in #1126 soon!

CrashLandonB · October 11, 2022, 4:04pm

Good to know. Thank you.

So I changed my approach to not use a color key, so that every Curve is the same color and the plot is shaded by pixel density. I removed *color_points from the above computation of self.overlay, and I also removed the definitions of sec_keys, color_key, and color_points to be safe, but I still get an “Insufficient colors provided” error. Can I plot data as shown in the Datashader v0.14.2 Timeseries docs (under “Plotting Large Numbers of Time Series Together”), but with HoloViews so I can zoom etc.?

(Note: in the above code you also have to change the line data = create_generic_data(3, 22, 10) to use a value greater than 22 in order to exceed the size of the standard color map. Change 22 to 23, for example, and the error is raised.)

jbednar · October 16, 2022, 4:12pm

The kdims='k' argument on the NdOverlay and the ds.by('k', ...) aggregator both specify a categorical plot by k, so you’ll need to remove at least one and probably both of those if you don’t want a categorical plot. Once you do, it shouldn’t be using color_key or caring how many colors there are.

CrashLandonB · October 18, 2022, 4:04pm

Yep, after removing both, the plotting executed without error.

I’m noticing a couple of other odd behaviors, but I’ll save those for another post and call this one solved. Thanks!