Loading/rendering time in browser

Dear all,

I have been using Panel/HoloViews/hvPlot for dashboards at my company for the last two years, but the loading/rendering time in the browser is always slow: it takes 2 to 3 minutes to load, depending on the type of HoloViews objects I use.

I have tried every option explained here and on the Bokeh Discourse, but it always takes 2 to 3 minutes; it is understandable that Bokeh rendering always takes time.

At the same time, I am also an expert in Tableau, and its dashboards always load quickly, which is understandable since it is a paid product and the industry leader.

The comparison is always (Python ETL + Panel app dashboard) vs (Python ETL + Tableau dashboard).

Your input will be highly appreciated.


Hi @khannaum

Could you post a minimal, reproducible example that takes 2-3 minutes to load?

In my experience the loading/rendering time is anywhere from below 1 s to 5 s, depending on what I do.

To add to @Marc’s answer, Bokeh rendering happens entirely on the browser side, so it has zero effect on the time it takes to render the initial page. I remember going over this with you a long time ago, but without some example it was always impossible to figure out what exactly was taking up so much time. What I might recommend is that you install panel 0.13.0dev6 and pyinstrument and then launch your application with the following options:

panel serve your_app.py --admin --profiler pyinstrument

Then navigate to http://localhost:/admin, specifically the Launch Profiler tab, and provide us with that output.

@philippjfr @Marc thanks for the feedback and concern

Let me summarize what I am doing:

  1. Command line:

/usr/bin/nohup /home/NCOR6649/anaconda3/bin/panel serve main_cust_dash_board_hv.ipynb --port 5001 --address 10.173.2.243 --allow-websocket-origin=10.173.2.243:5001 --show --session-token-expiration 900000 --unused-session-lifetime 60000000 --keep-alive 10000000 --websocket-max-message-size 104857600000 &

  2. main_cust_dash_board_hv.ipynb attached (35.0 KB)

  3. Final code:

tab_dash = pn.Tabs(
    ('Nation Wide Trends', nwd_trends()),
    ('Geo Mapping', cust_mapping()),
    ('Site Wise Complaints', site_menu(site_wise_plots)),
    ('Number Wise Complaints', msidn_menu()),
    ('Heat Maps', rbu_heatmap()),
    ('KPI Analysis', pt_stat()),
    dynamic=True
)
time.sleep(0.5)

tmpl = pn.template.FastListTemplate(title='Jazz Customer Complaints')

tmpl.main.append(tab_dash)

tmpl.servable('Customer Complaints')
  4. The template and tabs take 2 to 3 minutes to load. But once loaded, clicking on any tab renders the Bokeh graphs easily.

I tried to run it, but the files with the data are missing. :frowning: You can attach them here, I think.

It would be nice if you could provide a shorter example, so that the people here can try it and help you in a more effective way!

@nghenzi thanks for trying.

Due to privacy issues, the data cannot be shared.

It is a pandas DataFrame with a maximum of 50 MB in memory when processed.

The notebook contains functions that return HoloViews/hvPlot/Matplotlib graphs for every tab.

Kindly review the script and suggest:

Why do the template and tabs take time to load in the browser on other LAN nodes, yet once loaded, when any tab menu item is clicked, its Bokeh graphs display quickly?

What I meant is that you can fabricate fake data, for example with numpy.random.random(matrix.shape), so that the people who have free time to check this problem can help you. It is a nice gesture to provide a minimal reproducible example so people can check it fast. One last thing: there are several exec commands with some missing .py file, and I do not know if those commands are doing anything.

In short, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem.
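For instance, a minimal sketch of fabricating stand-in data (the column names and values here are hypothetical, guessed from the app code shared later in the thread; adjust them to the real schema):

import numpy as np
import pandas as pd

n = 10_000
rng = np.random.default_rng(42)

# fake complaints table with the same columns the dashboard expects
fake = pd.DataFrame({
    'DATE': pd.date_range('2020-01-01', periods=n, freq='H'),
    'MSISDN': rng.integers(3000000000, 3010000000, n),
    'compl': rng.choice(['Complaint A', 'Complaint B', 'Complaint C'], n),
    'RBU': rng.choice(['North', 'South', 'Central A', 'Central B'], n),
    'cityname': rng.choice(['City A', 'City B', 'City C'], n),
})
fake.to_csv('cust_compl.csv', index=False)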


I second @nghenzi. Please take the time to make a minimal, reproducible example. Without it, it can take a very long time to understand the issue, or it may even be impossible. Thanks.

@Marc @nghenzi thanks for the advice.

Files shared:

notebook:
main_cust_dash.ipynb (23.2 KB)

cust_complaints csv file link:

https://drive.google.com/drive/folders/15XYKBxuO_H65G0ipoM5GWq6UNvUcEMLP?usp=sharing

I believe you have to change the permissions of the Drive file to "anyone who has the link".

ok

I was running your example (with panel 0.13.0a5.post19+ga2e4c848 and bokeh 2.4) and indeed it has some problem, but I could not see it. Sometimes the Bokeh messages show this problem with the Bokeh references,

and the app's memory keeps increasing each time a new session is launched.

It begins with a rendering time of 20 seconds, and after 5 or 6 sessions it renders in 110 seconds. Each time you run it, it increases by 10 seconds. The memory begins at 600 MB and keeps increasing; it is at 1300 MB now.

The app is really big and messy to debug, beyond my time capabilities. The only advice I have is to avoid nested layouts as much as you can. In some cases you have, inside a Tab, something like pn.Column(pn.FlexBox(pn.indicators.Trend(w=200, h=200))), which is completely unnecessary. In any case, the problem is not related to this but to the memory leak. If you delete the Column and the FlexBox, you reduce the rendering time by only 1 or 2 seconds.

(Screenshots of the rendering time: the first time, after one rendering, and after several renderings.)

I copied the code below with several things deleted, but the improvement is marginal compared with the increase produced in each session. The data is the CSV that was shared previously.

import numpy as np, pandas as pd
import panel as pn, time, datetime

import colorcet as cc
from colorcet.plotting import swatch

from bokeh.settings import settings
from bokeh.models import HoverTool
from bokeh.models.widgets.tables import DateFormatter,NumberFormatter
settings.resources = 'cdn'
settings.resources = 'inline'  # overrides the previous line; only the last assignment takes effect

import holoviews as hv
from holoviews import dim, opts
from holoviews.element import tiles
from holoviews.element.tiles import StamenTerrain

import hvplot.pandas 
import datashader

pn.extension('tabulator',sizing_mode='stretch_width')

LSM_cmap = cc.CET_L4[::-1]

cus_compl = pd.read_csv('cust_compl.csv')
cus_compl['DATE'] = cus_compl['DATE'].astype('datetime64[ns]')
cus_compl['Month'] = cus_compl['DATE'].dt.month
cus_compl['Year'] = cus_compl['DATE'].dt.year
cus_compl['Month_Year'] = pd.to_datetime(cus_compl.Month.astype(str)+"-"+cus_compl.Year.astype(str))
cus_compl = cus_compl.reset_index()

cus_compl.drop_duplicates(inplace=True)


def create_trend(cus_compl):
    cus_compl_val = cus_compl.Month_Year.value_counts().rename_axis('Year_Month').reset_index(name='counts').sort_values(by='Year_Month')
    cus_compl_val['cum'] = cus_compl_val.counts.cumsum()
    cus_compl_val_daily = cus_compl[cus_compl.DATE> datetime.datetime.now() - pd.to_timedelta("30day")]['DATE'].dt.date.value_counts().rename_axis('Year_Month').reset_index(name='counts').sort_values(by='Year_Month')
    cus_compl_val_daily['cum'] = cus_compl_val_daily.counts.cumsum()
    
    data_daily = {'x': cus_compl_val_daily.index, 'y': cus_compl_val_daily.counts.values}
    data = {'x': cus_compl_val.index, 'y': cus_compl_val.counts.values}
    
    trend_monthly = pn.indicators.Trend(title='Nation Wide Trend', data=data, width=200, height=200)
    trend_daily = pn.indicators.Trend(title='Daily_Trend', data=data_daily, width=200, height=200)
    
    return trend_monthly,trend_daily


def create_trend_reg(cus_compl,reg):
    cus_compl_val = cus_compl[(cus_compl.RBU==reg)].Month_Year.value_counts().rename_axis('Year_Month').\
                    reset_index(name='counts').sort_values(by='Year_Month')
    cus_compl_val['cum'] = cus_compl_val.counts.cumsum()
    data = {'x': cus_compl_val.index, 'y': cus_compl_val.counts.values}
    trend_monthly = pn.indicators.Trend(title=F'{reg}', data=data, width=200, height=200)
    return trend_monthly

class groups:    
    def complaint_grp_count(*arguments):
        '''
        time_v='30day'
        complaints='compl'
        relat=(Region,MBU,MSISDN,CITY,SiteCode)
        '''
        aList = list(arguments)
        if len(aList)>3:
            time_v=aList[-1]
        else:
            time_v="30day"

        data = aList[0]
        groupby_wk = list(aList[1:-1])
    
        return data[data.DATE> datetime.datetime.now() - pd.to_timedelta(time_v)].groupby(groupby_wk)['MSISDN'].count().reset_index(name='counts').sort_values(by='counts')
    
    def site_grp_count():
        return cus_compl.groupby(['SiteCode','Month_Year',
            'compl','RBU','MBU','cityname','MSISDN','Longitude',
            'Latitute'])['MSISDN'].count().reset_index(name='counts').sort_values(by='counts')
    
    def msisdn_grp_count():
        return cus_compl.groupby(['MSISDN','Month_Year',
            'compl','RBU','MBU','cityname','SiteCode','Longitude',
            'Latitute'])['SiteCode'].count().reset_index(name='counts').sort_values(by='counts')
    
    def monthly_grp_count():
        return cus_compl.groupby(['Month_Year',
            'compl','RBU','MBU','cityname',
            'SiteCode'])['MSISDN'].count().reset_index(name='counts').sort_values(by='counts')


def rbu_heatmap():

    rbu_heatmap_comp = groups.monthly_grp_count().hvplot.heatmap(title='Complaints w.r.t Regions',x='Month_Year', 
                        y='compl',groupby='RBU', C='counts', reduce_function=np.sum ,cmap=LSM_cmap,colorbar=True,
                        height=500,width=750).opts(toolbar=None )

    rbu_heatmap_mbu = groups.monthly_grp_count().hvplot.heatmap(title='Complaints w.r.t MBU',x='Month_Year', y='MBU',
                        groupby='RBU', C='counts', reduce_function=np.sum ,cmap=LSM_cmap,colorbar=True,height=500,
                        width=750).opts(toolbar=None )

    rbu_heatmap_city = groups.monthly_grp_count()[groups.monthly_grp_count().counts>20]\
        .hvplot.heatmap(title='Complaints w.r.t Cities',x='Month_Year', y='cityname',groupby='RBU', 
        C='counts', reduce_function=np.sum ,cmap=LSM_cmap,colorbar=True,height=500,width=750).opts(toolbar=None ) 

    rbu_heatmap_comp = rbu_heatmap_comp.opts(tools=[HoverTool(tooltips=[('Complaint',"@compl"),('Count',"@counts"),
                       ('Month', '@{Month_Year}{%Y-%m}'),], formatters={'@{Month_Year}': 'datetime' })])
    rbu_heatmap_mbu = rbu_heatmap_mbu.opts(tools=[HoverTool(tooltips=[('MBU',"@MBU"),('Count',"@counts"),
                       ('Month', '@{Month_Year}{%Y-%m}'),],formatters={'@{Month_Year}': 'datetime'})])
    rbu_heatmap_city = rbu_heatmap_city.opts(tools=[HoverTool(tooltips=[('City',"@cityname"),('Count',"@counts"),
                          ('Month', '@{Month_Year}{%Y-%m}'),],formatters={'@{Month_Year}': 'datetime'})])
    rbu_heatmap_temp = (rbu_heatmap_comp+rbu_heatmap_mbu+rbu_heatmap_city).cols(2)
    rbu_heatmap_total = pn.panel(rbu_heatmap_temp,widgets={'RBU': pn.widgets.Select},widget_location='top_left')
    
    return rbu_heatmap_total


def nwd_trends():    
    yearly_comp_gph = groups.complaint_grp_count(cus_compl,'Month_Year','compl',"1200day").hvplot.scatter(x='Month_Year',y='counts',by='compl',dynamic=True,shared_axes=False,use_index=False,legend='right')\
    .opts(toolbar=None,height=500,width=1000).opts(opts.Scatter(color=hv.Cycle('Category20'), line_color='k',size=dim('counts')/700,
                       show_grid=True, width=1000, height=1400), opts.NdOverlay(legend_position='left', show_frame=False)        )

    year_reg_grp = groups.complaint_grp_count(cus_compl,'Month_Year','compl','Region',"1200day")\
               .hvplot.line(x='Month_Year',y='counts',by='compl',groupby='Region',\
                    width=1550, height=600, dynamic=True,shared_axes=False).layout().cols(1).relabel('Complaints Trends')

    top_msisdn_comp_gph = groups.complaint_grp_count(cus_compl,'MSISDN','compl','30Days').sort_values(by='counts').nlargest(10,'counts').sort_values(by='counts')\
        .hvplot.barh(title='Top 10 Numbers', x='MSISDN',y='counts',by='compl',stacked=True,
        shared_axes=False).opts(toolbar=None,width=900, height=500)

    top_reg_comp_gph = groups.complaint_grp_count(cus_compl,'Region','compl','30Days').sort_values(by='counts')\
        .hvplot.barh(title='Region Wise',x='Region',y='counts',by='compl',stacked=True,
        shared_axes=False).opts(toolbar=None,width=900, height=500)

    top_city_comp_gph = groups.complaint_grp_count(cus_compl,'CITY','compl','30Days').sort_values(by='counts').nlargest(30,'counts').sort_values(by='counts')\
        .hvplot.barh(title='Top 30 Cities',x='CITY',y='counts',by='compl',stacked=True,
        shared_axes=False).opts(toolbar=None,width=900, height=500).sort(['compl','CITY'],reverse=True)

    top_mbu_comp_gph = groups.complaint_grp_count(cus_compl,'MBU','compl','30Days').sort_values(by='counts').nlargest(15,'counts').sort_values(by='counts')\
        .hvplot.barh(title='Top 15 MBU',x='MBU',y='counts',by='compl',stacked=True,
        shared_axes=False).opts(toolbar=None,width=900, height=500)

    top_site_comp_gph = groups.complaint_grp_count(cus_compl,'SiteCode','compl','30Days').sort_values(by='counts').nlargest(20,'counts').sort_values(by='counts')\
    .hvplot.barh(title='Top 20 Sites',x='SiteCode',y='counts',by='compl',stacked=True,shared_axes=False).opts(toolbar=None,width=900, height=500)

    comp_perc_gph = ((groups.complaint_grp_count(cus_compl,'compl','30Days').set_index('compl').plot.pie(y='counts',title="Complaints", legend=False, \
                       autopct='%1.1f%%', \
                       shadow=False,figsize=(10,10),ylabel='',\
                    startangle = 180)).get_figure())

    daily_trends = pn.Column(pn.Spacer(height=30),\
                        pn.Row(create_trend(cus_compl)[1],pn.pane.Matplotlib(comp_perc_gph,dpi=450)) ,\
                        pn.Spacer(height=30),\
                        pn.pane.Markdown("## Top Last 30 days Trends", sizing_mode="stretch_width"),\
           (top_reg_comp_gph+top_mbu_comp_gph+top_city_comp_gph+top_msisdn_comp_gph+top_site_comp_gph).cols(2),\
    sizing_mode='stretch_width')

    yearly_trends = pn.Column( pn.Spacer(height=30),create_trend(cus_compl)[0],pn.Spacer(height=30),
                        pn.Row(create_trend_reg(cus_compl,'South'),create_trend_reg(cus_compl,'North'), 
                        create_trend_reg(cus_compl,'Central B'),create_trend_reg(cus_compl,'Central A')  ),         
                        year_reg_grp.relabel('Yearly Complaints Analysis')                 )

    return pn.Row(pn.Tabs(('Yearly_Trends',yearly_trends),('Daily_Trends',daily_trends),tabs_location='above',dynamic=True))


def cust_mapping():
    cust_group_by_date=pd.DataFrame(groups.site_grp_count())
    x, y = datashader.utils.lnglat_to_meters(cust_group_by_date.Longitude, cust_group_by_date.Latitute)
    cust_group_by_work_projected = cust_group_by_date.join([pd.DataFrame({'easting': x}), pd.DataFrame({'northing': y})])

    RBU_select = pn.widgets.Select(name='RBUs', options=list(cust_group_by_work_projected.RBU.unique()))
    City_select =  pn.widgets.Select(name='CityLevel', options=list(sorted(cust_group_by_work_projected.cityname.unique())))
    Complaint_select = pn.widgets.MultiSelect(name='Complaints', options=list(sorted(cust_group_by_work_projected.compl.unique())))

    Month = pn.widgets.DateRangeSlider( name='Month',
        start=cust_group_by_work_projected.Month_Year.min(), end=cust_group_by_work_projected.Month_Year.max(),
        value=(cust_group_by_work_projected.Month_Year.min(), cust_group_by_work_projected.Month_Year.max()))

    hv.extension('bokeh')
    levels=[2,4,5]
    colors=['#0000FF','#FFFF00','#FF0000']

    wiki = tiles.StamenTerrain().redim(x='easting', y='northing')

    cust_group_by_work_hol=cust_group_by_work_projected.hvplot.points(x='easting', y='northing', c='counts', 
        hover_cols=['SiteCode', 'Month_Year','RBU','MBU','cityname','compl'], s='counts',scale=10, cmap=colors,
        height=850, width=1550 , xaxis=None, yaxis=None,use_index=False, legend=False,colorbar=True,
        dynamic=True).opts(toolbar='above').opts(tools=[HoverTool(tooltips=[('SiteCode',"@SiteCode"),
            ('Complaint',"@compl"),('Count',"@counts"),('Month', '@{Month_Year}{%Y-%m}'),('MBU', '@{MBU}'),
            ('RBU',"@RBU"), ('City', "@cityname"), (' ',"==============="),], 
             formatters={'@{Month_Year}': 'datetime'})])
    
    cust_group_by_work_hol=cust_group_by_work_hol.apply.opts(xticks=[1,3,5], clabel='Customer Complaints',clim=(1,10),color_levels=3,cmap='rainbow',
        colorbar_opts={  'major_label_overrides': { 0: 'none', 2: 'low',  5: 'high'    }       }        )

    cust_group_date_wise_b=wiki * (cust_group_by_work_hol.opts(toolbar='left')).apply.select(Month_Year=Month.param.value,
                cityname=City_select.param.value,compl=Complaint_select.param.value,watch=True)
   
    widgets = pn.WidgetBox(City_select,Complaint_select,Month,sizing_mode='fixed')
    date_format={'Month_Year': DateFormatter(), }
    filter_table=pn.widgets.Tabulator(cust_group_by_work_projected, pagination='remote', groupby=['Month_Year'], 
        width=800,layout='fit_data_table', formatters=date_format, show_index=False, )

    filter_table.add_filter(City_select,'cityname')
    filter_table.add_filter(Complaint_select,'compl')
    filter_table.add_filter(Month,'Month_Year')

    return pn.Column(widgets,pn.Column(cust_group_date_wise_b.opts(width=1550),filter_table),sizing_mode='fixed')
  

def msidn_menu():
    MSISDN_select = pn.widgets.Select(name='MSISDN', options=list(sorted(groups.msisdn_grp_count().nlargest(30,'counts').sort_values(by='counts').MSISDN.unique())))
    date_format={  'Month_Year': DateFormatter(),   'MSISDN': NumberFormatter(format='0'),   }
    MSISDN_table=pn.widgets.Tabulator(groups.msisdn_grp_count().sort_values(by='counts'),pagination='remote', groupby=['MSISDN'],\
                                      width=800,layout='fit_data_table',formatters=date_format,show_index=False,)
    MSISDN_table.add_filter(MSISDN_select,'MSISDN')
    return pn.Column(MSISDN_select,MSISDN_table)


tab_dash=pn.Tabs(
         ('Nation Wide Trends',nwd_trends() ),
         ('Geo Mapping', cust_mapping()),
        ('Number Wise Complaints', msidn_menu()),
        ('Heat Maps',rbu_heatmap()),
       dynamic=True
    )

tmpl = pn.template.FastListTemplate(title='Customer Complaints')
tmpl.header_background='Red'
tmpl.header_color='Blue'
tmpl.title='Customer Complaints'

tmpl.main.append(tab_dash)

tmpl.servable('Customer Complaints')

I hope you find a solution.


You can try to save the CSV data in pn.state.cache in order to avoid re-reading it. There are several places where you can improve, but pandas.read_csv takes 1 or 2 seconds more each time you call it.


If you change the line

cus_compl = pd.read_csv('cust_compl.csv')

to

if 'data' not in pn.state.cache.keys():
    cus_compl = pd.read_csv('cust_compl.csv')
    pn.state.cache['data'] = cus_compl.copy()
else:
    cus_compl = pn.state.cache['data']

it improves the rendering time. After 10 sessions, it stays at 20-30 seconds of rendering time. However, you need to improve your nwd_trends() function. I do not understand what it does; it performs a lot of pandas operations, one class and several Panel objects all together, which makes it difficult to see where the bottleneck is. I would try to separate the pandas operations from the (Panel, HoloViews, Bokeh) operations and watch the profiler in order to define exactly where the problem is. I am inclined to believe that there is some problem with pandas, more than with Bokeh, but it is hard to see with this code structure.
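As a side note, newer Panel versions also provide pn.state.as_cached, which wraps this same caching pattern in a helper; a minimal sketch, assuming the same cust_compl.csv file:

import pandas as pd
import panel as pn

def load_data():
    # executed only once per server process; subsequent sessions reuse the cached frame
    return pd.read_csv('cust_compl.csv')

cus_compl = pn.state.as_cached('data', load_data)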


First of all, I love the profiler and the admin site. I spent about an hour playing with it…

If you check the profiler, there is something happening with the legends outside of the plot, I think. You have a lot of plots and stuff, so it is difficult to improve your code. You need to review it carefully and decrease the complexity. I think the difference you see with Tableau is that there you are an expert, so that code is really polished.

I stopped using the code from @khannaum and tried the following code:

import numpy as np
import pandas as pd
import hvplot.pandas  # noqa
import panel as pn
pd.options.plotting.backend = 'holoviews'

index = pd.date_range('1/1/2000', periods=1000)
df = pd.DataFrame(np.random.randn(1000, 4), index=index, columns=list('ABCD')).cumsum()

plot = df.hvplot()

tmpl = pn.template.FastListTemplate(title='Customer Complaints')
tmpl.header_background='Red'
tmpl.header_color='Blue'
tmpl.title='Customer Complaints'

tmpl.main.append(plot)

tmpl.servable('Customer Complaints')

You can check that it begins rendering in 1-2 seconds, and after several tries it takes 14 seconds. In a big app, this is what generates the 100 seconds that @khannaum gets.



In general, avoid nesting layouts. The Bokeh layout engine is not capable of handling this well and it will slow down the app. Use one of the built-in templates or build your own.

In Bokeh 3.0 the Bokeh layout engine should be fixed/removed, which should solve a lot of layout performance issues.
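To make that concrete, here is a minimal sketch of flattening the pn.Column(pn.FlexBox(...)) pattern @nghenzi pointed out earlier (the Trend data is made up for illustration):

import panel as pn
pn.extension()

data = {'x': list(range(10)), 'y': list(range(10))}

# nested: every extra container is another node for the Bokeh layout engine to solve
nested = pn.Column(pn.FlexBox(pn.indicators.Trend(title='Trend', data=data, width=200, height=200)))

# flat: pass the component directly, with the same visual result and less layout work
flat = pn.indicators.Trend(title='Trend', data=data, width=200, height=200)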

@Marc @nghenzi thanks for the help and feedback

Kindly identify the nested layouts in my code so that I can make it better and more optimal.

Last year I had similar issues with another dashboard that uses data in the GB range. I reverted to vaex DataFrame groupbys instead of pandas, and there was a significant performance gain, reducing rendering time from 4-5 minutes to 2 minutes.

In the next few days I will share the same example with the vaex DataFrame groupby and we can discuss it.

But the problem is that we have to compete with equivalent Tableau dashboards, which have no such issues with the same data.
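For anyone unfamiliar with that swap, here is a rough, hypothetical sketch of moving one of the groupbys to vaex (untested against the real data; the column names follow the app code above):

import pandas as pd
import vaex

# convert the pandas frame to a lazily evaluated vaex frame
vdf = vaex.from_pandas(pd.read_csv('cust_compl.csv'))

# count complaints per region, analogous to the pandas groupby/count in the app code
counts = vdf.groupby(by=['RBU', 'compl'], agg={'counts': vaex.agg.count()})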

Hello! Have you used pyinstrument here, or is it another package? Can I get more information/documentation on this, as I need to do some diagnosis on my dashboard, please?

I used pyinstrument, which is what the admin page uses in the dev version of Panel.

You can check it here:

https://pyviz-dev.github.io/panel/user_guide/Performance_and_Debugging.html?highlight=profiling
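If you want to profile a specific piece of code outside of the admin page, pyinstrument can also be used directly; a minimal sketch:

from pyinstrument import Profiler

profiler = Profiler()
profiler.start()

# ... the code you want to diagnose, e.g. building one of the dashboard tabs ...

profiler.stop()
print(profiler.output_text(unicode=True, color=True))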
