Hvplot, link selection, interactive() and Datashader: Is it possible?

I am in the process of creating a FastAPI-based Panel app that serves, among other things, a visualization app consisting of the following components:

  • 1 DateRangeSlider widget,
  • 1 Float slider widget,
  • 1 scatter hvplot of around 1 million points, and
  • 3 histograms plotting the counts over the ranges of ‘Speed’, ‘Altitude’ and ‘Accuracy’ for all the GPS coordinates.
  • some informational displays showing ‘Calories burned’, ‘Average CO2 savings’, ‘Traffic Rate vs Time’, etc. (code not shown).

My GPS data points are collected over a period of time which means that I can do selection based on timestamp, hence having a DateRangeSlider.
So far the original data file consists of around 10 million records (GPS points). Here I am providing only around 2 million points, related to one of the cities in the project. The data that I am providing is not “cleaned” and contains only the 16 columns needed for this topic.
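As a rough illustration of what the timestamp-based selection amounts to, here is a minimal pandas sketch on a made-up frame. The column name `time` matches the code further down; the helper name `filter_by_time` and the sample values are assumptions for illustration only, not part of the app.

```python
import pandas as pd


def filter_by_time(df, start, end, time_col="time"):
    """Return the rows whose timestamp lies in [start, end], both ends inclusive."""
    mask = (df[time_col] >= start) & (df[time_col] <= end)
    return df[mask]


# Toy stand-in for the GPS data (values are invented).
toy = pd.DataFrame({
    "time": pd.to_datetime(
        ["2023-01-01", "2023-01-15", "2023-02-01", "2023-03-01"]),
    "speed": [1.2, 3.4, 2.2, 5.0],
})

# Fixed bounds standing in for the two ends of the DateRangeSlider.
subset = filter_by_time(
    toy, pd.Timestamp("2023-01-10"), pd.Timestamp("2023-02-01"))
```

In the app the two bounds come from the slider rather than fixed timestamps, so the same comparison is re-evaluated whenever the slider moves.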

Problems/questions:

I am facing two scenarios:

A. If I want to show all the data by rasterizing the scatter hvplot, then I can keep the “link selection” between all 4 plots, but the sliders can only be connected to the 3 histograms (by using ‘.interactive()’). If I try to connect the sliders to the scatter plot, I get the “Exception: Nesting a DynamicMap inside a DynamicMap is not supported” error message. Disadvantage: I cannot see the effect of the date selection on the scatter plot.

B. If I want to use the ‘link selection’ between the 3 plots and at the same time keep the interactivity between the plots and the sliders, then I have to reduce the number of GPS data points by a factor of 0.03 and disable the rasterize/Datashader option. Disadvantage: only a small part of the data is visualized, due to the limitations of the browser. Also, the plotted points look very grainy.

Now: I want the best of both worlds! I want to have all the GPS data points in the plot, in rasterized form, while keeping the interactivity between all the plots and the slider widgets. Is it possible? What should I do, and how?

I feel that I have missed something very obvious but I do not know what!
A video about the “problem”:

The code:

# General modules
import pandas as pd
import panel as pn
from panel.theme import Material
import colorcet as cc

import datashader as ds
import hvplot.pandas
import pyarrow.parquet

# Holoviews
import holoviews as hv
from holoviews.operation.datashader import datashade, rasterize
from holoviews.selection import link_selections

hv.extension('bokeh')
pn.extension('tabulator')
pn.config.design = Material


# Read the parquet file into a data frame taken from Glaciers project
def load_data():
    gps = pd.read_parquet('Karlskrona_public.parquet', engine='pyarrow')
    gps['latdeg'] = gps.northing
    gps['x'], gps['y'] = (gps.easting, gps.northing)
    return gps

# Cache the data
gps = pn.state.as_cached('glaciers', load_data)


# Create a sampled data frame (used when Datashader is not applied)
rgps = gps.sample(frac=0.3)
rgps.head()

# Make the dataset interactive (https://holoviz.org/tutorial/Interactive_Pipelines.html)
gpsi = gps.interactive()

# Make 'Date' range sliders (https://panel.holoviz.org/reference/widgets/DateRangeSlider.html)
date_slider = pn.widgets.DateRangeSlider(
    name='Date', start=gps.time.iloc[0], end=gps.time.iloc[-1])

# Filter the dataframe interactively using the time slider
# Interactive Time Filtered Data = itfd
itfd = gpsi[['time', 'speed', 'accuracy', 'altitude', 'easting', 'northing']][
    (gpsi['time'] >= date_slider.param.value_start) &
    (gpsi['time'] <= date_slider.param.value_end)
    ]

# Define link selection between components
ls = hv.link_selections.instance(unselected_alpha=0.02)

# Plot the GPS points
# Alternative plot: using hv.Points
# USING 'gpsi' INSTEAD OF 'gps' WILL GIVE RISE TO THE "NESTED ..." ERROR
gps_plot = gps.hvplot('easting',
                      'northing',
                      kind='scatter',
                      rasterize=True,
                      cmap=cc.fire,
                      cnorm='eq_hist',
                      xlabel='Longitude',
                      ylabel='Latitude',
                      responsive=True,
                      height=600
                      )
# link selection on scatter plot
lsgp = ls(gps_plot)

# Group the plot with the background map
geo_plot = hv.element.tiles.OSM() * lsgp


# Plot the Speed histogram 'sph'
# NOTE: if 'gps' is used as the data for the plot, the connection to the date range slider breaks.
sph = itfd.hvplot(
    y='speed', xlabel='Speed', ylabel='Count', kind='hist', responsive=True, min_height=200, fill_color='#88d8b0', alpha=1.0) # groupby='activity.activity_type', ????? #f1948a
# link selection on speed hist (lssph)
lssph = ls(sph.holoviews())  # .holoviews()


# Plot the Accuracy histogram 'ach'
ach = itfd.hvplot(
    y='accuracy', xlabel='Accuracy', ylabel='Count', kind='hist', responsive=True, min_height=200, fill_color='#85c1e9', alpha=1.0)
# link selection on accuracy hist (lsach)
lsach = ls(ach.holoviews())  # .holoviews()


# Plot the Altitude histogram 'alh'
alh = itfd.hvplot(  # IF 'gps' IS used INSTEAD OF 'itfd' THE GRAPHS WILL BE EMPTY
    y='altitude', xlabel='Altitude', ylabel='Count', kind='hist', responsive=True, min_height=200, fill_color='#34d1e2', alpha=1.0)
# link selection on altitude hist (lsalh)
lsalh = ls(alh.holoviews())  # .holoviews()

# Group the histograms
slshists = pn.Row(lssph, lsach, lsalh)


# Create a linked data-points table
table = (itfd.pipe(ls.filter, selection_expr=ls.param.selection_expr)
         .drop(columns=['easting', 'northing'])
         .pipe(pn.widgets.Tabulator, pagination='remote', page_size=15, theme='fast')
         )

# Creating tabs to use in the layout
tabs = pn.Tabs(
    ('Plots', pn.Row(pn.Column(geo_plot, slshists))),
    ('Table', table.panel()), sizing_mode='stretch_width'
)

# ----------------------------------------------------------------------------------

# Creating Dashboard: Define layout
dynamic_layout = pn.template.FastListTemplate(
        title='Project - Visualization',
        sidebar=[
            pn.Row(pn.Column(pn.pane.Markdown("### Some text about the project"),
                             pn.pane.Markdown("Select date:"),
                             pn.Column(date_slider),
                             pn.pane.Markdown("Some other things down here")))        ],
        main=pn.Row(tabs),
        theme_toggle=False)
dynamic_layout.servable();

The data file:

Send me a “Hi” if you have questions :wink:


After a little rewriting of the code I can now answer my own question: “Hvplot, link selection, interactive() and Datashader: Is it possible?” My answer now is YES!
The working code is below. I am not sure it is the “correct” code, since I get a warning in the terminal: “WARNING:root:Dropping a patch because it contains a previously known reference (id='p13989'). Most of the time this is harmless and usually a result of updating a model on one side of a communications channel while it was being removed on the other end.” :joy:

# General modules
import pandas as pd
import panel as pn
from panel.theme import Material
import colorcet as cc

import datashader as ds
import hvplot.pandas
import pyarrow.parquet

# Holoviews
import holoviews as hv
from holoviews.operation.datashader import datashade, rasterize
from holoviews.selection import link_selections

hv.extension('bokeh')
pn.extension('tabulator')
pn.config.design = Material


# Read the parquet file into a data frame taken from Glaciers project
gps = pd.read_parquet('Karlskrona_public.parquet', engine='pyarrow')

# Make the dataset interactive
gpsi = gps.interactive()

# Link Selection declaration
ls = hv.link_selections.instance(unselected_alpha=0.2)

# Make 'Date' range sliders
date_slider = pn.widgets.DateRangeSlider(name='Date', start=gps.time.iloc[0], end=gps.time.iloc[-1])

# Filter the interactive dataframe by the time slider
itfd = gpsi[['easting', 'northing', 'longitude', 'latitude', 'time', 'userid', 'speed', 'accuracy', 'altitude', 'distance']][
    (gpsi['time'] >= date_slider.param.value_start) &
    (gpsi['time'] <= date_slider.param.value_end)]

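# Plot the data rasterized with Datashader (rasterize=True) and link selection.
# dynamic=False plus .holoviews() below is what lets the interactive pipeline
# (itfd) drive the rasterized plot in this version.
geo = itfd.hvplot.points(
    x='easting', y='northing', rasterize=True,
    xlabel='Longitude', ylabel='Latitude', cmap=cc.fire, cnorm='eq_hist',
    responsive=True, min_height=600, dynspread=True, dynamic=False)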

geo_plot = hv.element.tiles.OSM() * ls(geo.holoviews())

# Plot the Speed histogram, 'sph'
sph = itfd.hvplot(y='speed', xlabel='Speed', ylabel='Count', kind='hist', responsive=True, min_height=200, fill_color='#88d8b0', alpha=1.0) 

# link selection on speed hist (lssph)
lssph = ls(sph.holoviews())


# Group the histograms (only the speed histogram is defined in this version)
slshists = pn.Row(lssph)

# Create a linked data-points table
table = (itfd.pipe(ls.filter, selection_expr=ls.param.selection_expr)
         .drop(columns=['accuracy', 'altitude', 'easting', 'northing'])
         .pipe(pn.widgets.Tabulator, pagination='remote', page_size=10, theme='fast')
         )

# Creating tabs to use in the layout
tabs = pn.Tabs(
    ('Plots', pn.Row(pn.Column(geo_plot, slshists))),
    ('Table', table.panel()), sizing_mode='stretch_width'
)

# ----------------------------------------------------------------------------------

# Creating Dashboard: Define layout
dynamic_layout = pn.template.FastListTemplate(
        title='Project - Visualization',
        sidebar=[
            pn.Row(pn.Column(pn.pane.Markdown("### Some text about the project"),
                             pn.pane.Markdown("Select date:"),
                             pn.Column(date_slider),
                             pn.pane.Markdown("Some other things down here")))],
        main=pn.Row(tabs),
        theme_toggle=False)
dynamic_layout.servable();

If I use the ‘DateRangeSlider’ and the selection tools (box select or lasso select) on the plot, the rasterized plot, the histogram and the table are all updated to reflect the change in the selection.

What now?
My challenge now is to find (i) the number of points selected, and (ii) the number of unique ‘userid’ values in the selected area.

I have not been able to extract or access the data points from the selection expression. I tried to follow the code that creates the table above but could not figure it out. I will have to ask for help on this matter in another section of this Discourse.
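One possible starting point, sketched under assumptions: once the selection has been turned into a filtered DataFrame (the table code above does this with `ls.filter`), both numbers are ordinary pandas reductions. The helper name `selection_summary`, the toy frame and the boolean mask standing in for a box selection are all hypothetical; only the `userid` column name comes from the code above.

```python
import pandas as pd


def selection_summary(selected, user_col="userid"):
    """Count the rows and the distinct users in an already-filtered frame."""
    return {
        "n_points": len(selected),
        "n_users": int(selected[user_col].nunique()),
    }


# Toy stand-in for the GPS data (values are invented).
toy = pd.DataFrame({
    "userid": ["a", "a", "b", "c"],
    "easting": [0.0, 1.0, 2.0, 9.0],
    "northing": [0.0, 1.0, 1.5, 9.0],
})

# A plain boolean mask stands in for the box/lasso selection here;
# in the app the filtered frame would come from ls.filter instead.
in_box = toy["easting"].between(0.0, 2.5) & toy["northing"].between(0.0, 2.5)
summary = selection_summary(toy[in_box])
```

In the app itself, something like `itfd.pipe(ls.filter, selection_expr=ls.param.selection_expr).pipe(selection_summary)` might give a value that updates with the selection, by analogy with the table pipeline, but I have not verified that.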