Panel servable app memory usage

Hello,

I’m running a servable panel app. I have the latest version installed via conda. When I connect to my app via the browser, the app is executed and takes up some memory. When I close the tab, open another tab and connect again, the memory used increases. If I keep doing this, the memory consumption keeps increasing until the machine runs out of memory.
Any ideas what could be causing this? I’m running the app behind nginx.
TIA!

I found the --check-unused-sessions and --unused-session-lifetime arguments to panel serve and set them to 5 seconds and 1 second respectively, but this did not reduce the memory consumption. I watched top for a couple of minutes, and Panel’s memory usage does not drop after a browser session is closed.

Is there any chance you can provide a reproducible example here? I’ve tested simple apps and they do not leak memory.

Oh, that’s interesting. Guess one of the many moving parts in my app is screwing things up.

The app itself is too unwieldy to provide as an example. I’ll try cleaning up the code to see if I can make a reproducible example. In case things suddenly work better, I’ll report that too.

Ok, here is a minimal example which keeps leaking memory on my machine (macOS High Sierra):

https://drive.google.com/drive/folders/14yFyw-C2-8aLVyPa81o7r9SAgHW2nQmC?usp=sharing

I ran the app using

panel serve CoronaStory_Minimal.ipynb --check-unused-sessions 5000 --unused-session-lifetime 1000

The folder also contains a requirements.txt generated from pip freeze.

Thanks for looking into this!
Joy


Perfect, thank you!

Not sure if this helps: I ran panel serve with the log level set to trace, and I can see that the relevant cleanup code (in bokeh’s server/session.py, with all the del statements) is being called some time after I close the browser tabs. However, the memory usage does not go down.

Is there something that I can try to help understand this issue better?
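
One thing I could try is comparing tracemalloc snapshots between sessions, to see which allocation sites keep growing. A rough sketch (the helper names and where to call them are just for illustration):

import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames of traceback per allocation

_baseline = None

def take_baseline():
    # Call once, e.g. after the first browser tab has loaded.
    global _baseline
    _baseline = tracemalloc.take_snapshot()

def report_growth(limit=10):
    # Call after opening and closing a few more tabs; prints the
    # allocation sites that grew the most since the baseline.
    current = tracemalloc.take_snapshot()
    for stat in current.compare_to(_baseline, 'lineno')[:limit]:
        print(stat)

Wiring these two helpers to a couple of Button widgets would make them easy to trigger from the browser after opening and closing tabs.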

Thanks!

I was digging a little more, and here’s what I learned.

I wrote my app as a Parameterized class, and all the big variables were allocated as globals and used within the class methods when needed.

Since these variables are allocated only once, the memory usage stays small even if I open multiple sessions of the app: it still increases, but by a very small amount.

This is how it looks (the relevant parts):

import pickle

import param
import panel.widgets as pnw

with open('datasets/entity_list_df.pkl', 'rb') as pkl:
    ENTITY_LIST_WORDCLOUD = pickle.load(pkl)

with open('datasets/papers_publish_time.pkl', 'rb') as pkl:
    PUBLISH_TIME = pickle.load(pkl)

START_TIME = PUBLISH_TIME.index[0].to_pydatetime()
END_TIME = PUBLISH_TIME.index[-1].to_pydatetime()

OMIT_WORDS_FROM_CLOUD = ['virus', 'viruses',
                         'Abstract', 'results',
                         'infection', 'infected',
                         'days', 'study', 'associated with',
                         'BACKGROUND', 'review', 'information',
                         'investigate', 'findings', 'studies']

class WordCloud(param.Parameterized):
    wordcloud_date_range = None

    def __init__(self):

        self.wordcloud_date_range = pnw.DateRangeSlider(
                                 name='Choose time period',
                                 start=START_TIME, end=END_TIME,
                                 value_throttled=(START_TIME, END_TIME),
                                 callback_throttle=1000)

        super().__init__()

On the other hand, if I override __init__ and allocate the variables inside the class itself, the memory usage shoots up and does not come back down even after the bokeh server destroys the session.

This is how it looks (the relevant parts):


class WordCloud(param.Parameterized):
    wordcloud_date_range = None

    def __init__(self):

        with open('datasets/entity_list_df.pkl', 'rb') as pkl:
            self.entity_list_wordcloud = pickle.load(pkl)

        with open('datasets/papers_publish_time.pkl', 'rb') as pkl:
            self.publish_time = pickle.load(pkl)

        self.start_time = self.publish_time.index[0].to_pydatetime()
        self.end_time = self.publish_time.index[-1].to_pydatetime()

        self.omit_words_from_cloud = ['virus', 'viruses',
                         'Abstract', 'results',
                         'infection', 'infected',
                         'days', 'study', 'associated with',
                         'BACKGROUND', 'review', 'information',
                         'investigate', 'findings', 'studies']

So it looks like variables allocated within a Panel class do not get garbage collected once the session is destroyed. I have tried both, and it does not matter whether these variables are class attributes or instance attributes.

Is this expected behaviour?
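
In the meantime, the module-level pattern above is what I am sticking with, since it only grows memory very slowly. Here is a minimal sketch of the same idea with the pickle loading factored into a cached helper; functools.lru_cache is just one way to make sure each file is read once per server process and shared across sessions:

import pickle
from functools import lru_cache

import param
import panel.widgets as pnw


@lru_cache(maxsize=None)
def load_pickle(path):
    # Runs once per path for the whole server process; every session
    # afterwards gets the same cached object back.
    with open(path, 'rb') as pkl:
        return pickle.load(pkl)


class WordCloud(param.Parameterized):

    def __init__(self, **params):
        super().__init__(**params)
        publish_time = load_pickle('datasets/papers_publish_time.pkl')
        start = publish_time.index[0].to_pydatetime()
        end = publish_time.index[-1].to_pydatetime()

        # Only the small per-session widgets are allocated here; the big
        # DataFrames are shared via the cache above.
        self.wordcloud_date_range = pnw.DateRangeSlider(
            name='Choose time period',
            start=start, end=end,
            value=(start, end))

This sidesteps the per-session growth rather than explaining it, but it keeps the app usable while the underlying issue is investigated.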

Thanks for investigating, that’s a solid starting point. This behavior is definitely not expected.

@philippjfr, could you point me to where in the code (bokeh or panel) I should look to understand why something like this might happen?

This will soon become time-critical and I would like to see if I can contribute to understanding or fixing the problem.

Debugging memory leaks is always a pain. Haven’t had a chance to look yet, but I’ll have to take a look before the next release. Should have some time this weekend.

Thanks! I really appreciate all your help in this matter.

Was a solution to this problem ever found?


I wonder if it’s due to the module-level constants being used? What if these were moved into the class instead? (A sketch of that variant follows the snippet below.)

with open('datasets/entity_list_df.pkl', 'rb') as pkl:
    ENTITY_LIST_WORDCLOUD = pickle.load(pkl)

with open('datasets/papers_publish_time.pkl', 'rb') as pkl:
    PUBLISH_TIME = pickle.load(pkl)

START_TIME = PUBLISH_TIME.index[0].to_pydatetime()
END_TIME = PUBLISH_TIME.index[-1].to_pydatetime()

OMIT_WORDS_FROM_CLOUD = ['virus', 'viruses',
                         'Abstract', 'results',
                         'infection', 'infected',
                         'days', 'study', 'associated with',
                         'BACKGROUND', 'review', 'information',
                         'investigate', 'findings', 'studies']
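
For concreteness, here is a minimal sketch of what “in the class” could look like: the pickles become class attributes, so they are read once when the class body is executed (the _load helper is only there to keep the class body readable):

import pickle

import param


def _load(path):
    # Runs once per path when the class body below is executed,
    # not once per session.
    with open(path, 'rb') as pkl:
        return pickle.load(pkl)


class WordCloud(param.Parameterized):
    # Class attributes are created a single time, when the module is
    # imported, so every session shares the same objects; this gives
    # the same lifetime as the module-level constants above, just
    # namespaced under the class.
    entity_list_wordcloud = _load('datasets/entity_list_df.pkl')
    publish_time = _load('datasets/papers_publish_time.pkl')

    start_time = publish_time.index[0].to_pydatetime()
    end_time = publish_time.index[-1].to_pydatetime()

    omit_words_from_cloud = ['virus', 'viruses',
                             'Abstract', 'results',
                             'infection', 'infected',
                             'days', 'study', 'associated with',
                             'BACKGROUND', 'review', 'information',
                             'investigate', 'findings', 'studies']

Note that an earlier post reports that class attributes did not behave differently from instance attributes in this respect, so this may mainly be useful for confirming where the references are being held.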