Interactive results with Jupyter notebooks
In this article, you will learn how to build interactive results for your machine learning solutions in a Jupyter notebook using ipywidgets.
In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercion, etc. But that's one of the things I like about Python. –Tim Peters
People who come to data science from outside software development or IT often struggle to communicate their results in a simple way, and machine learning newcomers face the same challenge. Making a machine learning solution testable by a large group of people can be hard with Python frameworks like Django and Flask, because they require web development skills.
Technology is evolving quickly, though, and several tools now simplify the work of making models live, such as Streamlit, Gradio, and Gooey.
Today I will take you through a common environment that Python users rely on for developing machine learning models, and show how you can build interactive results in a Jupyter notebook.
What is Jupyter Notebook?
Jupyter Notebook is an open-source web application that you can use to create and share documents that contain live code, equations, visualizations, and text.
The Jupyter notebook is an interactive notebook that lets you write documents with embedded code and execute that code on the fly. It was originally developed as part of the IPython project and could only be used for Python code at that time. Nowadays, the Jupyter notebook integrates multiple languages, such as R, Julia, Haskell, and many more; the notebook supports about 50 languages.
It allows the user to download the notebook in various file formats like PDF, HTML, Python, Markdown, or a .ipynb file.
Simple interactive results of ML models on Jupyter Notebook
I assume readers are comfortable with Python and machine learning concepts. Let's take a regression challenge that requires predicting the medical insurance costs of an individual based on age, gender, body mass index (kg / m ^ 2), smoking behavior, and the number of children covered by health insurance (number of dependents). More information about the dataset used to build this solution can be found here.
The dataset has 7 columns. I opted to use 5 features (X) and the charges column as our target.
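The feature and target selection described above can be sketched as follows. This is a minimal sketch under assumptions: the rows are made-up stand-in values (not taken from the real dataset), the name `feature_columns` is mine, and the column names are the ones the prediction function later in this article uses. With the real file you would build `df` with `pd.read_csv` instead.

```python
import pandas as pd

# Made-up stand-in rows (assumption): replace with pd.read_csv(<your file>)
df = pd.DataFrame(
    {
        "age": [19, 33, 46, 52],
        "sex": ["female", "male", "female", "male"],
        "bmi": [27.9, 22.7, 33.4, 30.2],
        "children": [0, 1, 2, 3],
        "smoker": ["yes", "no", "no", "yes"],
        "charges": [16000.0, 4500.0, 8200.0, 27000.0],
    }
)

# The five features used in this article, and the column we want to predict
feature_columns = ["age", "sex", "bmi", "children", "smoker"]
X = df[feature_columns]
target = df["charges"]

print(X.shape, target.shape)
```

Keeping the feature names in one list makes it easy to reuse the same column order when building inputs for prediction later.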
Let's use Linear Regression to create a regression model that predicts medical insurance charges. Because we have object features like sex and smoker, let's make things easier with an sklearn pipeline, which lets us define every step our regression model needs in order to learn from the observations.
import pandas as pd  # used later to build the prediction input
from sklearn.pipeline import make_pipeline
from category_encoders import OneHotEncoder
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
We will use make_pipeline to wrap all the steps, OneHotEncoder to convert the object features [sex, smoker] to a numeric format that our regressor can understand, LinearRegression to build the regression model, and mean_absolute_error to evaluate the performance of our model.
Here is how we define our pipeline:
# build the model pipeline
model = make_pipeline(
    OneHotEncoder(use_cat_names=True),
    LinearRegression(),
)
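To make the encoding step less abstract, here is a plain-Python sketch of what one-hot encoding does to a categorical column. It is an illustration only, not the category_encoders implementation: the helper `one_hot_rows` is hypothetical, and the exact column order the library produces may differ. The idea it shows is that each category becomes its own 0/1 column, named from the column name and the category value.

```python
def one_hot_rows(rows, column):
    """Replace `column` in each row dict with one 0/1 column per category."""
    # Collect the distinct categories seen in this column
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


rows = [{"age": 39, "smoker": "yes"}, {"age": 25, "smoker": "no"}]
encoded = one_hot_rows(rows, "smoker")
print(encoded[0])  # {'age': 39, 'smoker_no': 0, 'smoker_yes': 1}
```

After this step every column is numeric, which is what a linear regressor needs.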
After defining our pipeline, we train our model by fitting the pipeline with the features (X) and labels (target).
# train the model to predict medical costs
model.fit(X, target)
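If category_encoders is not installed, a similar pipeline can be sketched with scikit-learn alone. Note that this swaps in a different technique from the one used in this article: scikit-learn's own OneHotEncoder wrapped in a column transformer, so that only the object columns are encoded. The rows are made-up stand-in values, and the names `X_demo`, `target_demo`, and `demo_model` are assumptions for illustration.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up stand-in rows (assumption), same column names as in the article
X_demo = pd.DataFrame(
    {
        "age": [19, 33, 46, 52, 28, 61],
        "sex": ["female", "male", "female", "male", "male", "female"],
        "bmi": [27.9, 22.7, 33.4, 30.2, 25.1, 29.0],
        "children": [0, 1, 2, 3, 0, 1],
        "smoker": ["yes", "no", "no", "yes", "no", "yes"],
    }
)
target_demo = pd.Series([16000.0, 4500.0, 8200.0, 27000.0, 3900.0, 30000.0])

demo_model = make_pipeline(
    # Encode only the object columns; pass the numeric ones through unchanged
    make_column_transformer(
        (OneHotEncoder(handle_unknown="ignore"), ["sex", "smoker"]),
        remainder="passthrough",
    ),
    LinearRegression(),
)
demo_model.fit(X_demo, target_demo)

# Predict for one new individual, given as a single-row DataFrame
new_row = pd.DataFrame(
    {"age": 39, "sex": "male", "bmi": 20.0, "children": 3, "smoker": "yes"},
    index=[0],
)
preds = demo_model.predict(new_row)
print(preds.shape)
```

The fitted pipeline is used the same way as the one in this article: call fit once, then predict on DataFrames with the same columns.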
Then we can evaluate our model to see how it performs on the training data:
# evaluate the model
y_pred_training = model.predict(X)
print("Training MAE:", mean_absolute_error(target, y_pred_training))
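For readers new to the metric, the mean absolute error is the average of the absolute differences between actual and predicted values. Here is a minimal by-hand sketch with made-up numbers; the function name `mean_absolute_error_by_hand` is hypothetical, and in practice the sklearn function shown above is the one to use.

```python
def mean_absolute_error_by_hand(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up numbers for illustration
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]

# Absolute errors are 10, 10 and 30, so the mean is 50 / 3
mae = mean_absolute_error_by_hand(actual, predicted)
print(round(mae, 2))  # 16.67
```

Since the score here is computed on the same rows the model was fitted on, it is a training error; it says how well the model fits the data it has seen, not how it will do on new individuals.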
Wow! We have built a simple regression model. Now it is time to focus on how to make interactive results with it.
Does our model need to be hosted in the cloud before we can interact with it?
If your answer is no, you're right. We can create an interactive environment right inside the Jupyter notebook using ipywidgets, a Python library that offers a set of Jupyter widgets for interactivity.
ipywidgets, also known as jupyter-widgets or simply widgets, are interactive HTML widgets for Jupyter notebooks and the IPython kernel. Notebooks come alive when interactive widgets are used. Users gain control of their data and can visualize changes in the data.
With ipywidgets, learning becomes an immersive, fun experience. Researchers and developers can easily see how changing the inputs to a model impacts the results.
Without further ado, let's install and import ipywidgets and then use it. Recent Jupyter installations may already include it, in which case you can skip the installation and simply import it.
pip install ipywidgets
# import ipywidgets
import ipywidgets as widgets
Then, let's create our function to perform predictions depending on the five features we used for model training.
def make_prediction(age, gender, bmi, children, smoker):
    """
    Predict insurance charges (medical costs) from:
    1. age - age of primary beneficiary
    2. gender - insurance contractor gender, female or male (the "sex" column)
    3. bmi - body mass index (kg / m ^ 2), weight relative to height, ideally 18.5 to 24.9
    4. children - number of children covered by health insurance / number of dependents
    5. smoker - smoking or not
    """
    # build a single-row DataFrame with the column names used for training
    df = pd.DataFrame({"age": age, "sex": gender, "bmi": bmi, "children": children, "smoker": smoker}, index=[0])
    prediction = model.predict(df).round(2)[0]
    return f"Predicted total insurance cost: ${prediction}"
Let's invoke our make_prediction function with the expected arguments:
make_prediction(39, "male", 20, 3, "yes")
# outputs
# 'Predicted total insurance cost: $29564.47'
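One detail of the function above that is easy to miss is the index=[0] argument. When every value in the dictionary is a scalar, pandas cannot infer how many rows to create and raises a ValueError unless an index is passed. The sketch below shows that form next to an equivalent one that wraps the dictionary in a list; the variable names are assumptions for illustration.

```python
import pandas as pd

inputs = {"age": 39, "sex": "male", "bmi": 20, "children": 3, "smoker": "yes"}

# Form used in the article: scalar values plus an explicit one-element index
row_with_index = pd.DataFrame(inputs, index=[0])

# Equivalent alternative: a list holding one record (one dict per row)
row_from_list = pd.DataFrame([inputs])

print(row_with_index.shape)  # (1, 5)
print(row_with_index.to_dict("records") == row_from_list.to_dict("records"))  # True
```

Either form gives the model a one-row table whose column names match the ones it was trained on.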
Great! It is time to plug our function into ipywidgets so that it becomes flexible to interact with. ipywidgets contains many widgets, but for our case we will use only four pieces of it.
The first is interact, a function that wraps our function and builds a control for each required input.
The second is IntSlider, a widget for controlling integer inputs to our model.
The third is FloatSlider, a widget for controlling float inputs to our model.
The last is Dropdown, a widget for inputs that require the user to select from multiple options.
widgets.interact(
    make_prediction,
    age=widgets.IntSlider(
        min=X["age"].min(),
        max=X["age"].max(),
        value=int(X["age"].mean()),  # start the slider at a whole number
    ),
    gender=widgets.Dropdown(options=sorted(X["sex"].unique())),
    bmi=widgets.FloatSlider(
        min=X["bmi"].min(),
        max=X["bmi"].max(),
        step=0.01,
        value=X["bmi"].mean(),
    ),
    children=widgets.IntSlider(
        min=X["children"].min(),
        max=X["children"].max(),
        value=int(X["children"].mean()),  # start the slider at a whole number
    ),
    smoker=widgets.Dropdown(options=sorted(X["smoker"].unique())),
);
You can now see interactive sliders and dropdowns that give you a flexible way to enter inputs and get predictions.
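The integer sliders take their minimum, maximum, and starting value from a column of X. That derivation can be pulled into a small helper, sketched here in plain Python so it runs without a notebook. The helper name `int_slider_bounds` is hypothetical, the ages are made-up values, and rounding the mean to the nearest integer is my choice for the starting position.

```python
def int_slider_bounds(values):
    """Return min, max and a rounded-mean starting value as plain ints."""
    values = list(values)
    return {
        "min": int(min(values)),
        "max": int(max(values)),
        # Round the mean so the slider starts on a whole number
        "value": int(round(sum(values) / len(values))),
    }


# Made-up ages for illustration
ages = [18, 25, 40, 64]
bounds = int_slider_bounds(ages)
print(bounds)  # {'min': 18, 'max': 64, 'value': 37}
```

Inside the notebook, the result could be unpacked into a slider with widgets.IntSlider(**int_slider_bounds(X["age"])), which keeps the interact call shorter when there are several integer inputs.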
This is a simple way to test how the model you built performs predictions on user inputs before deploying it to the cloud. Also, if you are teaching someone how to build a machine learning solution, it makes it easier to understand how a user will interact with the final solution.
It looks simple, but it has a big impact on understanding machine learning. Feel free to share this article with the community.