How to refer to data uploaded with FileInput for further processing?

Hi, I am very new to Panel (and using widgets in general). I have the code below to read an excel file and convert it to a dataframe. I understand that the processed dataframe is stored in active_load_excel() if the file upload is success. So if I want to process this dataframe further in a separate cell in juypter, it works. However, if I want to do the followup process (e.g., to print out this dataframe) in the same jupyter cell as my file_input widget, it will always return as None because it runs simultaneously as the file_input and the file_input is always empty when I launch the program.

In my example, is there a way to tell the print statement to ‘wait’ until the file input is completed? And in a more general sense, how to refer to the data uploaded with FileInput for future processing? I would really appreciate it if anyone could provide the syntax for this.


file_input = pn.widgets.FileInput(accept=".xlsx")

def load_excel(data):
if data is not None:
df = pd.read_excel(io.BytesIO(data)
return df

active_load_excel = pn.bind(load_excel, file_input.param.value)
pn.Column(file_input, active_load_excel)

#the following print state will return None if run in the same cell as file_input
#but will return the uploaded excel dataframe if run in the next cell

print(active_load_excel())

There are a lot of ways to do it (I’d recommend using param to create a parameterized object), but one easy solution is to just make df a global variable:

file_input = pn.widgets.FileInput(accept=".xlsx")

def load_excel(data):
    if data is not None:
        global df
        df = pd.read_excel(io.BytesIO(data))
        return df

active_load_excel = pn.bind(load_excel, file_input.param.value)

print(active_load_excel())

pn.Column(file_input, active_load_excel)

Here is a (sloppy) example with param:

import panel as pn
import param
import pandas as pd
import io
pn.extension()

class Data(param.Parameterized):
    df = param.DataFrame()
    
    def loader(self):
        file_input = pn.widgets.FileInput(accept=".xlsx")
        
        @pn.depends(file_input, watch = True)
        def load_excel(finput=file_input, self=self):
            data = file_input.value
            if data is not None:
                self.df = pd.read_excel(io.BytesIO(data))
                print (self.df)
                
        return file_input

data = Data()

pn.Column(data.loader, pn.widgets.DataFrame.from_param(data.param['df'], height = 500))
1 Like

Thank you so much for providing the sample code and introducing me to param. Both of the code worked! I did notice that when I use param (your second code), the browse button for FileInput always displays “no file selected” within a few seconds after the file update (the actual file name will appear for a very brief time). I was wondering if this has anything to do with the ‘watch = True’, and if there’s a way to keep the file name.

Sure! That’s as simple as adding () after the reference to data.loader, so the last line would look like this:

pn.Column(data.loader(), pn.widgets.DataFrame.from_param(data.param['df'], height = 500))

I’m no expert, but my understanding is that “calling” the method (i.e. adding the ()) makes it more of a static view, whereas leaving it uncalled makes it more dynamic. Don’t quote me on this (literally).

1 Like

Hi @ygmeow,

Don’t quote me but I think you may have stumbled into an old issue perhaps bug but I haven’t been able to find anything on it other than recall reading somewhere. I’ve put an example below that demonstrates the behaviour, as you can see from the video if you run your example you get no output, then change to serve it works in tab, come back and change cell to original example it all starts to work. I don’t normally encounter this because I utilise different cells when working with jupyter lab so don’t see it normally. To sum up looks like an issue if just using one cell in jupyter, fine if serving app.

import io
import panel as pn
import pandas as pd
import openpyxl
import param

pn.extension()

xlsx_file_input = pn.widgets.FileInput(accept=".xlsx")

def load_excel_file(file_bytes):
    if file_bytes is not None:
        return pd.read_excel(io.BytesIO(file_bytes))
    
get_dataframe = pn.bind(load_excel_file, xlsx_file_input)
dash = pn.Column(xlsx_file_input, get_dataframe)

dash

Also note you don’t need to use a global with bind, it returns the value, so you should be able to get the dataframe by typing

df = get_dataframe()
df.head()
#process dataframe further
1 Like

Hi Carl, thanks for pointing out this behavior (nice video example BTW). I might have noticed that too but it was not always reproducible on my end. I thought it was some glitch with my jupyter notebook :sweat_smile:

I figured out a workaround with suggestion from @riziles and added a button that triggers all the functions that do the downstream processing.

Happy New Year!

1 Like