Managing user sessions and server filesystem access

BeeHaive · August 23, 2023, 6:51pm

Hello!

I’m trying to set up a service where multiple users can create their own dashboards from notebooks. The notebooks will be developed on a shared JupyterHub server, which is also the server I intend to host the dashboards from.

When developing these notebooks, the users have access to the server filesystem. Each dashboard will be visible to the whole community.

My question is:

What is an effective way of limiting access to files on the server for each client/user connecting to the dashboard?

For example:
User 1 is in a group with access to files in folder A on the server.
User 1 creates a dashboard that relies on folder A.
User 2 is in the same group, and therefore he is able to view the dashboard contents.
User 3 does not have filesystem permissions to folder A, so he will not be able to view it.

I’m hosting the Panel app in a Docker container with root access.

Avenues I have considered include:

Using os.setuid for each user session
– so far this has been unsuccessful, because I’m not entirely sure how processes and threads are created and destroyed for each user session. Ideally, for this to work, I would have to run each client session in a separate process. Is that an option?
Launching a new application instance in a docker container for each connecting user
– I would love to avoid this

I hope this is the right forum for this sort of question. I’m also happy to clarify my problem more if it’s a little vague.

Kind regards

Marc · August 24, 2023, 4:48am

I believe you would need to know who the user is. You can use Basic Configuration or OAuth. See Configuring Authentication. See also Accessing User information.

You will then need

a map of groups and users as well as
a map of dashboards and groups.

These could be defined simply as dictionaries. They might also be provided by the identity management system in your company. For example, if you’re a Microsoft company, then you would be using Active Directory and would probably define AD groups to keep track of who can access the File Shares or Dashboards.

Then, in your application code, you can check whether the user is a member of a group that can access the dashboard. If so, you run the application code. If not, you run some code that informs the user that they cannot access the dashboard.
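
As a rough illustration of the two maps and the membership check, here is a minimal sketch in plain Python. The group names, user names, dashboard name, and the helpers `groups_of`, `can_access`, and `render` are all made up for the example; in a real deployment the maps would more likely come from LDAP/AD than from hard-coded dictionaries.

```python
# Hypothetical maps; replace with lookups against your identity system.
GROUP_MEMBERS = {
    "group_a": {"user1", "user2"},
    "group_b": {"user3"},
}
DASHBOARD_GROUPS = {
    "dashboard_a": {"group_a"},
}


def groups_of(user):
    """Return the set of groups the user belongs to."""
    return {group for group, members in GROUP_MEMBERS.items() if user in members}


def can_access(user, dashboard):
    """True if the user shares at least one group with the dashboard."""
    allowed = DASHBOARD_GROUPS.get(dashboard, set())
    return bool(groups_of(user) & allowed)


def render(user, dashboard, build_app):
    """Run the application code only for permitted users."""
    if can_access(user, dashboard):
        return build_app()
    return f"Sorry, you do not have access to {dashboard}."
```

Inside a Panel app with authentication configured, the user name would typically be read from `pn.state.user` and passed in as `user`; unknown users and unknown dashboards are denied by default.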

BeeHaive · August 24, 2023, 7:55am

Hi Marc,

Thanks a lot for your response!

I like your solution a lot, and this is definitely what I would go with if I was more in control of the code and dashboards.

The problem is that I won’t be managing the applications people will be writing. One solution would of course be to wrap all IO operations in an LDAP/AD check to see if permissions match, but I can’t guarantee that people are using these wrappers without some pretty sophisticated code analysis tools or IO middleware.

So I would like to restrict access at the file system level. Maybe this way they could even access their home folders and store things there directly?

I already know who is accessing the application via the headers (it’s set up with authentication via apache).

Is it feasible to spawn a subprocess for each session instance, or should I abandon this idea?
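
To make the subprocess idea concrete, here is a minimal sketch, assuming Python 3.9+ on Linux, a parent process running as root (as in the container described above), and user names that exist as local accounts. The helpers `identity_for` and `run_as` are hypothetical names for this example. Calling `os.setuid` inside the server process would change the identity of the whole process, so the sketch drops privileges only in a child process and lets the kernel enforce the file permissions there.

```python
import os
import pwd
import subprocess


def identity_for(username):
    """Look up uid, primary gid, and all group ids for a local account."""
    entry = pwd.getpwnam(username)
    groups = os.getgrouplist(username, entry.pw_gid)
    return entry.pw_uid, entry.pw_gid, groups


def run_as(username, argv, timeout=30):
    """Run argv in a child process that has dropped to the user's identity.

    The parent needs root (or CAP_SETUID/CAP_SETGID). The parent keeps its
    own identity, so other sessions are unaffected.
    """
    uid, gid, groups = identity_for(username)
    return subprocess.run(
        argv,
        user=uid,             # child switches to this uid before exec
        group=gid,            # and to this primary gid
        extra_groups=groups,  # supplementary groups, e.g. access to folder A
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

For example, `run_as(name, ["cat", "/path/in/folder_a/file.csv"])` (a made-up path) would return a non-zero `returncode` for a user without read permission on that file. Note that this only confines the work delegated to the child; any code the dashboard author runs directly in the server process still has the server's own permissions.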

BeeHaive · September 6, 2023, 3:17pm

I also asked this question on Bokeh:
Bokeh discourse