I am able to process 5 GB worth of snappy-compressed local Parquet files and visualize them using holoviews, datashader, panel and bokeh on my Mac. However, each user query takes a few seconds, sometimes even minutes, and the visualization only appears all at once after all processing has finished. I was wondering if there is any native support for progressively updating the visualization from a dask dataframe as parts of the result become available.
import param
import panel as pn
import dask.dataframe as dd
import holoviews as hv
from holoviews import opts
from holoviews.operation.datashader import datashade, dynspread, rasterize
from holoviews.streams import RangeXY

class DataExplorer(param.Parameterized):
    xaxis_filter = param.Range(...)

    def load_data(self):
        return dd.read_parquet("*.parquet")

    def process_data(self, df):
        # Expensive computation on a dask dataframe, for example:
        return df[df.xaxis.between(*self.xaxis_filter)]

    @param.depends('xaxis_filter')
    def make_view(self):
        plot_df = self.process_data(self.load_data())
        points = hv.Points(plot_df)
        # too slow; pub-sub behaviour might help
        scatter = dynspread(datashade(points))
        tooltip = rasterize(points, streams=[RangeXY]).apply(hv.QuadMesh)
        return (scatter * tooltip).opts(
            opts.QuadMesh(tools=['hover'], alpha=0, hover_alpha=0.2))

explorer = DataExplorer(name="")
dashboard = pn.Column(explorer.param, explorer.make_view)
I tried streaming visualizations, but the size of the Buffer started growing drastically, making the implementation unusable on my Mac. Any ideas or help would be appreciated.
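For context, the streaming attempt looked roughly like the sketch below. Buffer, DynamicMap and datashade are the actual HoloViews/Datashader APIs, but the column names and the partition-by-partition loop are illustrative placeholders rather than my exact code:

import dask.dataframe as dd
import pandas as pd
import holoviews as hv
from holoviews.streams import Buffer
from holoviews.operation.datashader import datashade

hv.extension('bokeh')

# Buffer accumulates every appended chunk, up to `length` rows, which is
# where the memory growth came from ('xaxis'/'yaxis' are placeholder columns)
buffer = Buffer(pd.DataFrame({'xaxis': [], 'yaxis': []}), length=50_000_000)
dmap = hv.DynamicMap(hv.Points, streams=[buffer])
shaded = datashade(dmap)

# Push each dask partition into the buffer as its result becomes available
ddf = dd.read_parquet("*.parquet")
for i in range(ddf.npartitions):
    chunk = ddf.get_partition(i).compute()  # materialize one partition
    buffer.send(chunk[['xaxis', 'yaxis']])

Each send() appends to the buffer, so memory use grows with every partition until the length cap is hit, which matches the behaviour I saw.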