I tried adding the cachetools TTLCache on top, and so far it seems to speed things up even more.
I updated the _get_data function from the above example as follows:
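from cachetools import cached, TTLCache

# time, np, pd and `cache` are assumed to come from the example above.
@cached(cache=TTLCache(maxsize=1024, ttl=600))  # up to 1024 entries, each kept for 600 seconds
def _get_data(value):
    value_str = str(value)
    if value_str in cache:
        # Already loaded earlier: return the stored DataFrame.
        return cache[value_str]
    else:
        print("loading data", value)
        time.sleep(1)  # simulate a slow load
        df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
        cache[value_str] = df
        print("data loaded")
        return df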
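
As a side note, the `cached` decorator does its own lookup before calling the function, so the manual check against `cache` may be redundant once the decorator is applied. Here is a minimal sketch of the decorator on its own. The names `load_rows` and `calls` are made up for illustration, and the TTL is shortened to 0.2 seconds so the expiry is easy to see (the snippet above uses 600):

```python
import time
from cachetools import cached, TTLCache

calls = []  # records every real (uncached) load, for illustration only

# Short TTL so expiry is observable quickly; not the value used in the post.
@cached(cache=TTLCache(maxsize=1024, ttl=0.2))
def load_rows(value):
    # Hypothetical stand-in for the slow load inside _get_data.
    calls.append(value)
    return [value] * 3

first = load_rows("a")    # miss: the function body runs
second = load_rows("a")   # hit: served from the TTLCache, body is skipped
time.sleep(0.3)           # wait until the entry is older than the TTL
third = load_rows("a")    # entry has expired, so the body runs again
print(len(calls))
```

The second call returns the very same object as the first, which is worth keeping in mind when the cached value is a mutable DataFrame.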
I still need to figure out how to get this working globally. Right now I think the cache only applies within a single session.
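
One possible direction, offered only as a sketch: keep a single module-level TTLCache and pass a lock to `cached`, so that every thread in the same Python process shares one cache. This assumes all sessions are served by one process. It would not share anything across separate worker processes, which would probably need an external store. The names `shared_cache`, `cache_lock`, `get_data_shared` and `load_count` are hypothetical:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from cachetools import cached, TTLCache

# One cache object at module level, shared by everything in this process.
shared_cache = TTLCache(maxsize=1024, ttl=600)
cache_lock = threading.Lock()  # guards cache access from multiple threads
load_count = 0                 # counts real loads, for illustration only

@cached(cache=shared_cache, lock=cache_lock)
def get_data_shared(value):
    # Hypothetical stand-in for the slow load.
    global load_count
    load_count += 1
    return {"value": value}

get_data_shared(1)  # first call fills the cache

# Simulate several sessions asking for the same value from different threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(get_data_shared, [1, 1, 1, 1]))

print(load_count, len(shared_cache))
```

The lock only protects reads and writes of the cache itself. Two threads that miss on the same key at the same moment can both run the load, which is why the sketch warms the cache before starting the threads.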