How to generate data for osm-3billion example

Hi,
I’m a Datashader and Python beginner, and I want to reproduce this example: Osm-3billion — Examples 0.1.0 documentation

The page gives nowhere near enough detail to work out how to actually create the osm-3billion.parq file it mentions.
I do see that the 1 billion point example is distributed, but that’s too small to be meaningful.
It doesn’t help that the spatial indexing link is broken and that datashader/2_Points.ipynb at 31b31826d74da38e4878f87c3e44511d08d302df · holoviz/datashader · GitHub seems to be under construction.

I do have the 3 billion point dataset ready and loaded, but I have no idea what combination of dask/datashader/geopandas tools I need to use to actually create the optimized parquet file.
Any help would be much appreciated :slight_smile:

Best,
Simon

I decided to get the 1 billion example running first, but it’s also outdated.
After lots of googling, I found that the way of loading the parquet file shown in Osm-1billion — Examples 0.1.0 documentation is deprecated. I now finally have the data loaded, but I just can’t figure out how to display it interactively with zooming.
Since I’m in the REPL, the bokeh example at the end doesn’t seem to display anything without a notebook environment.
Then I tried to combine it with the matplotlib example from the docs, but it’s not clear how to merge those examples, since they use quite different constructs.

If I use the way to construct agg from the matplotlib example I get this wonderful error:


If I use the way to create points from the 1billion example I get this error after a minute of calculating and freezing my PC:

>>> hd.shade(hv.Image(points))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\holoviews\element\raster.py", line 276, in __init__
    Dataset.__init__(self, data, kdims=kdims, vdims=vdims, extents=extents, **params)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\holoviews\core\data\__init__.py", line 334, in __init__
    super().__init__(data, **dict(kwargs, **dict(dims, **extra_kws)))
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\holoviews\element\raster.py", line 51, in __init__
    super().__init__(data, kdims=kdims, vdims=vdims, extents=extents, **params)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\holoviews\core\dimension.py", line 844, in __init__
    super().__init__(data, **params)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\holoviews\core\dimension.py", line 503, in __init__
    super().__init__(**params)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\parameterized.py", line 3173, in __init__
    self.param._setup_params(**params)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\parameterized.py", line 1387, in override_initialization
    fn(parameterized_instance, *args, **kw)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\parameterized.py", line 1641, in _setup_params
    setattr(self, name, val)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\parameterized.py", line 369, in _f
    return f(self, obj, val)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\parameterized.py", line 1201, in __set__
    self._validate(val)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\__init__.py", line 1442, in _validate
    self._validate_bounds(val, self.bounds)
  File "C:\Users\sdani\SimiWorld\ProgrammerLife\env\lib\site-packages\param\__init__.py", line 1456, in _validate_bounds
    raise ValueError("%s: list length must be at least %s."
ValueError: vdims: list length must be at least 1.

Anyways, can anyone give me a short snippet to get the dask dataframe into a zoomable window?
I’m here:

import datashader as ds
import dask.dataframe as dd
df = dd.read_parquet('osm-1billion.parq')
df = df.persist()

# Wrong code, please help from here ;) 
import holoviews as hv
from holoviews import opts
from holoviews.operation.datashader import datashade, dynspread
import holoviews.operation.datashader as hd  # needed for the hd.shade call below
hv.extension("bokeh", "matplotlib")
hv.output(backend="matplotlib")
agg = ds.Canvas().points(df,'x','y')
hd.shade(hv.Image(agg))

Hey Simon! Did you manage to figure out how to generate the osm-3billion.parq file from the CSV/txt file? I’m not sure how to proceed and could use some help. Thanks