`param.watch` and `onlychanged=True` unexpected behaviour on param.DataFrame

Hello,

Given this example:

import pandas as pd
import param

df = pd.DataFrame(
    {"a": [1,2,3]}
)

class A(param.Parameterized):

    df = param.DataFrame(default = df)
    x = param.Integer(default = 1)

    def __init__(self, **params):
        super().__init__(**params)

        self.param.watch(fn=self.test_x, parameter_names=['x'], onlychanged=True)
        self.param.watch(fn=self.test_df, parameter_names=['df'], onlychanged=True)

    def set_df(self, df):
        self.df = df

    def set_x(self, x):
        self.x = x

    def test_x(self, event):
        print(f"x is {event.type}")

    def test_df(self, event):
        # print(event.old.equals(event.new))
        print(f"df is {event.type}")

a = A()

a.set_df(df)
a.set_x(1)

>>> df is changed

I was wondering why, in the case of the integer, param correctly detects that the new value set on x is the same as the old one and so doesn’t print "x is changed", whereas re-setting df to an identical value does trigger the watched function, even though onlychanged=True.

I guess it has something to do with how event.old and event.new are being compared to each other? In the case of the dataframe I would like it to check for something like event.old.equals(event.new) (using pandas .equals method), but I don’t really know where to change this.

Any ideas or hints? Thanks!

These two issues over on GitHub pointed me in the right direction:

Not sure exactly if this is a stupid thing to do (I’m just starting to learn HoloViz stuff…), but adding this right after the imports in the script from the question fixes it:

param.parameterized.Comparator.equalities.update(
    {pd.DataFrame: lambda obj1, obj2: obj1.equals(obj2)}
    )
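For anyone wondering why registering a comparison function helps, here is a rough, simplified model of a type-keyed equality registry. It is only a sketch of the idea, not param's actual source: `EQUALITIES`, `is_equal`, and the `Frame` stand-in class are hypothetical names made up for illustration, and the assumption is that a type with no registered comparison is treated as "not equal", so the watcher fires.

```python
import numbers
import operator

# Hypothetical registry: maps a type to the function used to compare
# two values of that type (a simplified model, not param's own code).
EQUALITIES = {numbers.Number: operator.eq, str: operator.eq}


def is_equal(old, new):
    """Return True only if a registered comparison says the values match."""
    for known_type, eq in EQUALITIES.items():
        if isinstance(old, known_type) and isinstance(new, known_type):
            return eq(old, new)
    # No comparison registered for this type: assume the value changed,
    # so a watcher with onlychanged=True would still be triggered.
    return False


class Frame:
    """Stand-in for a DataFrame, with an .equals method like pandas has."""

    def __init__(self, rows):
        self.rows = rows

    def equals(self, other):
        return self.rows == other.rows


frame = Frame([1, 2, 3])
print(is_equal(1, 1))          # True: integers have a registered comparison
print(is_equal(frame, frame))  # False: Frame is unknown to the registry

# Registering a comparison for the type changes the outcome.
EQUALITIES[Frame] = lambda a, b: a.equals(b)
print(is_equal(frame, Frame([1, 2, 3])))  # True
```

Under that assumption, the `Comparator.equalities.update(...)` call above plays the same role as the last registration step in this sketch.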

Hey @bertcoerver, I believe this isn’t done by default because comparing two very large DataFrames could come with a big performance hit. If you deal with small-ish DataFrames, what you’re doing looks totally reasonable to me.
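If the cost of comparing large DataFrames is a concern, one possible mitigation is to try cheap checks before the full comparison. The sketch below is an assumption-laden illustration, not something param provides: `frames_equal` is a hypothetical helper, written duck-typed so it only relies on the `.shape` attribute and `.equals` method that pandas DataFrames have.

```python
def frames_equal(old, new):
    """Compare two DataFrame-like objects, trying cheap checks first."""
    # Same object: skip the full comparison entirely.
    if old is new:
        return True
    # Different shapes can never be equal, and checking this is cheap.
    if getattr(old, "shape", None) != getattr(new, "shape", None):
        return False
    # Fall back to the full element-wise comparison (e.g. DataFrame.equals).
    return old.equals(new)
```

It could be registered in place of the lambda, for example `param.parameterized.Comparator.equalities[pd.DataFrame] = frames_equal`. In the example from the question, `a.set_df(df)` passes the very same object back in, so the identity check alone would settle it without touching the data.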
