This summer, our team of four undergraduate students in Lehigh University’s Mountaintop Program developed a module that uses Bokeh to teach users how to create a simple machine learning model. It continues the overarching STEM Visualizations project that our advisor has posted about before here.
With this module, Biodegradability Classification, the goal is to teach a beginner how to build a classification model using an intuitive interface, all using Bokeh. The user is guided through a five-step process:
- Prepare Data: the user can choose their data, set the train-validate-test split, and explore the dataset through the responsive histogram and datatable.
- Train: the user can choose one of three algorithms to train and view its performance through an accuracy display and learning curve plot.
- Validate: the user can modify the hyperparameters of the model and view the effects through the same accuracy display and learning curve.
- Test: the user can perform a final evaluation on one of the models saved in the Train and Validate steps, viewing results on the accuracy display and a hoverable confusion matrix. They may also export the test results (with prediction types) to an .xlsx or .csv file. Additionally, they can compare the precision-recall curves of up to two of their saved models.
- Predict: lastly, the user can use any model from the Train and Validate steps to predict the biodegradability class of any molecule. The user draws a molecule using the embedded tool (from PubChem), then copies and pastes the resulting SMILES string into the TextInput widget.
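The five steps above map onto a fairly standard scikit-learn workflow. The sketch below is only an illustration of that flow, not the module's actual code: the synthetic dataset from `make_classification`, the 70/15/15 split, the choice of a random forest, and the `max_depth` values tried are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Prepare Data: synthetic binary data stands in for the real dataset (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)

# Hypothetical 70/15/15 train-validate-test split, done as two successive splits.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=0
)

# Train and Validate: fit one model per hyperparameter value and keep the
# one with the best accuracy on the validation set.
best_model, best_val_acc = None, -1.0
for depth in (2, 4, 8):
    model = RandomForestClassifier(n_estimators=50, max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    if val_acc > best_val_acc:
        best_model, best_val_acc = model, val_acc

# Test: evaluate the chosen model once on the held-out test set.
test_predictions = best_model.predict(X_test)
test_acc = accuracy_score(y_test, test_predictions)
cm = confusion_matrix(y_test, test_predictions)

# Predict: classify a single new sample (here, just the first test row).
predicted_class = best_model.predict(X_test[:1])[0]
```

Keeping the test set out of the hyperparameter loop is the point of the three-way split: the validation set absorbs the tuning decisions, so the test accuracy stays an honest final estimate.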
This module was built with Bokeh, scikit-learn, and various chemistry-related libraries. Working mostly independently of our mentors, we relied heavily on the extensive Bokeh documentation to create this app.
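For readers new to Bokeh, here is a minimal sketch of the kind of widget-driven plot the module relies on, such as a histogram that responds to a column selection. It is not taken from the module: the column names `feature_a` and `feature_b`, the random data, and the helper `histogram_data` are hypothetical.

```python
import numpy as np
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select
from bokeh.plotting import figure

rng = np.random.default_rng(0)

# Hypothetical stand-in columns; a real app would load its dataset here.
data = {
    "feature_a": rng.normal(0.0, 1.0, 500),
    "feature_b": rng.exponential(2.0, 500),
}


def histogram_data(values, bins=20):
    """Return bar heights and bin edges in the shape a quad glyph expects."""
    counts, edges = np.histogram(values, bins=bins)
    return {"top": counts, "left": edges[:-1], "right": edges[1:]}


source = ColumnDataSource(histogram_data(data["feature_a"]))
select = Select(title="Column", value="feature_a", options=list(data))

plot = figure(title="Histogram of feature_a")
plot.quad(top="top", bottom=0, left="left", right="right", source=source)


def update(attr, old, new):
    # Replacing source.data makes Bokeh redraw the bars for the new column.
    source.data = histogram_data(data[new])
    plot.title.text = f"Histogram of {new}"


select.on_change("value", update)
layout = column(select, plot)
```

To try it interactively, one would add `curdoc().add_root(layout)` (with `from bokeh.io import curdoc`) and run the script with `bokeh serve`, since Python callbacks registered with `on_change` need a running Bokeh server.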
Our team of undergraduates consists of:
Eric Burton (LinkedIn / GitHub)
Sophia Pham (LinkedIn / GitHub)
Devin Pombo (LinkedIn / GitHub)
Mathurtion Rajendrackumaar (LinkedIn / GitHub)
And our mentors:
Srinivas Rangarajan (Website)
Raghuram Thiagarajan (LinkedIn / GitHub / @swamilikes2code)
Joseph Menicucci (Bio)
All of the STEM Visualizations modules can be found on our website.
We are looking to collect student feedback to further improve our module and to write a paper showcasing the use of inquiry-based learning in engineering classrooms.