Updating multiple parameters at once from a dictionary?

LinuxIsCool · January 24, 2023, 5:59pm

I have a dataframe where each row represents a parameter set for a parameterized class. For each row, I want to compute a value based on the parameters and create a new column representing that computation. Is there a canonical way to achieve this? The following code works, but it’s instantiating a new object for every row. Would it be more computationally efficient to instantiate a single object and reuse it by updating its parameters instead of creating a new object for every row?

To be clear, the following code achieves my goal, but I’m wondering if there is a better way to do it. I’m also wondering, more generally, how to update a set of parameters on an instantiated object from a dictionary.

import pandas as pd
import numpy as np
import param as pm

df = pd.DataFrame(np.random.randn(30, 2), columns=['a','b'])

class Test(pm.Parameterized):
    a = pm.Number()
    b = pm.Number()
    
    def c(self):
        return self.a + self.b

df['c'] = [Test(**ab).c() for ab in df.to_dict(orient='records')]

I’m imagining something like this:

t = Test()
df['c'] = [t.params.set(**ab).c() for ab in df.to_dict(orient='records')]

Thanks!

Hoxbro · January 25, 2023, 8:26am

The method for updating multiple parameters is X.param.update(), see here. This function only returns None, so if we want it in a list comprehension, we have to be a bit creative:

df["c"] = [t.c() for ab in df.to_dict(orient='records') if t.param.update(**ab) is None]

Marc · January 26, 2023, 7:21am

Hi Both

For improved readability, I would consider creating a helper function or method updated_c. For example:

import pandas as pd
import numpy as np
import param as pm

df = pd.DataFrame(np.random.randn(30, 2), columns=['a','b'])

class Test(pm.Parameterized):
    a = pm.Number()
    b = pm.Number()
    
    def c(self):
        return self.a + self.b

    def updated_c(self, **ab):
        self.param.update(**ab)
        return self.c()

calculator = Test()

df['c'] = [calculator.updated_c(**ab) for ab in df.to_dict(orient='records')]
print(df)

LinuxIsCool · February 21, 2023, 5:28pm

Thank you both!

LinuxIsCool · April 23, 2023, 6:20am

I came back to this use case and found myself preferring a DataFrame.apply version of @Hoxbro’s solution:

df["c"] = df.apply(lambda row: t.c() if t.param.update(**row) is None else None, axis=1)

I think this is a very powerful pattern for producing datasets based on parameter values.

Cheers