Deploying a Machine Learning model as a web application makes it easy for others with little or no programming experience to use it.
In previous tutorials, where I explained how I created a house price prediction app and a loan eligibility app, we made use of Streamlit. Streamlit is easy to use. This is why it is a popular choice for data scientists.
In a world where learning one framework isn’t enough, wouldn’t it be nice to learn how to accomplish this using Django?
Understandably, Django can be tough for beginners. The only remedy though is constant practice. If you have been going through some of my tutorials on Django, there is no doubt that you have become familiar with the process.
Therefore, in this tutorial, we will add to your knowledge by creating a machine learning application using Django. Guess what we will use for prediction? The famous Titanic dataset!
Developing Machine Learning Model
The Machine Learning classification problem aims to predict the survival of the passengers based on their attributes. There are multiple steps involved in creating a Machine Learning model. To keep things simple, we will skip those steps and focus on showing how to implement Machine Learning in Django.
Create a folder for this project. Then, inside the folder, create a file named model.py. This is where we will develop our model. You can download the dataset on my GitHub page.
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv('titanic.csv', index_col=0)

# selecting the features we need
df = df[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked', 'Survived']]

# encoding the Sex column as a numeric value
df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})

# converting the Age column to a numeric type
df['Age'] = pd.to_numeric(df['Age'])

# filling the null values with the mean age
df['Age'] = df['Age'].fillna(np.mean(df['Age']))

# creating additional features from the Embarked column after converting it to dummy variables
dummies = pd.get_dummies(df['Embarked'])
df = pd.concat([df, dummies], axis=1)
df = df.drop(['Embarked'], axis=1)

X = df.drop(['Survived'], axis=1)
y = df['Survived']

# scaling the features
scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X)

model = LogisticRegression(C=1)
model.fit(X_scaled, y)

# saving the model and the scaler as pickle files
with open('titanic.pkl', 'wb') as f:
    pickle.dump(model, f)
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)
After importing the CSV file, we select the features we need. The Age column has missing values, so we simply fill them with the mean age. We then convert the Embarked column to dummy variables and add them to the features.
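To make these two preprocessing steps concrete, here is a minimal pure-Python sketch of what mean imputation and dummy encoding do. The helper names fill_with_mean and one_hot are hypothetical and the sample values are made up for illustration; model.py itself relies on pandas for both steps.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    average = mean(known)
    return [average if v is None else v for v in values]


def one_hot(values, categories):
    """Turn each categorical value into one 0/1 column per category."""
    return [[1 if v == c else 0 for c in categories] for v in values]


# Made-up sample data, not taken from titanic.csv
ages = [22.0, None, 26.0]
ports = ["S", "C", "Q"]

filled_ages = fill_with_mean(ages)               # [22.0, 24.0, 26.0]
port_columns = one_hot(ports, ["C", "Q", "S"])   # columns ordered C, Q, S
```

Each passenger ends up with exactly one 1 among the C, Q, and S columns, which is why the model expects three separate embarkation inputs later on.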
We are using LogisticRegression as the Machine Learning algorithm to make this prediction. Finally, we save the model and the scaler as pickle objects to be used later.
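The save-and-load round trip can be tried in isolation. The sketch below is only an illustration: save_object and load_object are hypothetical helper names, a plain dict stands in for the trained model, and a temporary directory is used so nothing is written to your project folder.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Pickle obj to path, closing the file even if dumping fails."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a pickled object back from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A plain dict stands in for the trained model here
stand_in = {"C": 1, "features": ["Pclass", "Sex", "Age"]}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.pkl"
    save_object(stand_in, target)
    restored = load_object(target)
```

The restored object is an equal copy of the original. Only unpickle files you created yourself, because loading a pickle can execute arbitrary code.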
Creating Django Project
As covered in previous tutorials, the steps to set up a new Django project are as follows:
- create and activate a virtual environment
- install required libraries
- create a new Django project which we will call predictions
- create a requirements.txt file
- create a new app called titanic
- perform a migration to set up the database
- update settings.py
$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install django tzdata scikit-learn pandas numpy
(venv) $ pip freeze > requirements.txt
(venv) $ django-admin startproject predictions .
(venv) $ python manage.py startapp titanic
(venv) $ python manage.py migrate
(venv) $ python manage.py runserver
The (venv) prefix indicates that you are in the virtual environment. Don’t forget to include the dot, which tells Django to create the project in the current directory. If everything was installed successfully, opening the local server address in your browser shows Django’s default welcome page.
Open the settings.py file to let Django know that a new app has been created. We will do so in the INSTALLED_APPS section.
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    # custom app
    'titanic',
]
Let’s configure the URLs for our website. Open the urls.py in the project folder and make it appear like this:
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('titanic.urls')),
]
Let’s now configure the URLs for the app. Create a urls.py in the app folder.
from django.urls import path

from .views import home, result

urlpatterns = [
    path('', home, name='home'),
    path('result/', result, name='result'),
]
Let’s create another file in the app folder named predict.py. This is where we will use the pickled files to make predictions.
import pickle


def getPrediction(pclass, sex, age, sibsp, parch, fare, C, Q, S):
    # load the trained model and the fitted scaler
    with open('titanic.pkl', 'rb') as f:
        model = pickle.load(f)
    with open('scaler.pkl', 'rb') as f:
        scaler = pickle.load(f)

    # scale the input the same way the training data was scaled
    transformed = scaler.transform([[pclass, sex, age, sibsp, parch, fare, C, Q, S]])
    prediction = model.predict(transformed)[0]
    return 'Not Survived' if prediction == 0 else 'Survived' if prediction == 1 else 'error'
The function takes exactly the features used to train the model, in the same order.
Notice we first transform the values the user will input on the web page. Since the model was trained on scaled features, we have to scale any values the user enters with the same fitted scaler. Finally, we return the result based on the prediction.
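To see why the saved scaler matters, it helps to look at what min-max scaling to the range 0 to 1 computes: each value x in a column becomes (x - min) / (max - min), where min and max come from the training data. The class below is a simplified, hypothetical stand-in written for illustration, not scikit-learn's implementation, and the numbers are made up.

```python
class TinyMinMax:
    """Simplified stand-in for a min-max scaler over columns of numbers."""

    def fit(self, rows):
        # Remember the minimum and maximum of every column
        columns = list(zip(*rows))
        self.mins = [min(c) for c in columns]
        self.maxs = [max(c) for c in columns]
        return self

    def transform(self, rows):
        scaled = []
        for row in rows:
            scaled.append([
                (x - lo) / (hi - lo) if hi != lo else 0.0
                for x, lo, hi in zip(row, self.mins, self.maxs)
            ])
        return scaled


# Made-up training rows: [age, fare]
train = [[20.0, 10.0], [40.0, 30.0], [60.0, 50.0]]
scaler = TinyMinMax().fit(train)

# A new passenger is scaled with the training min and max, not its own
new_row = scaler.transform([[30.0, 50.0]])   # [[0.25, 1.0]]
```

A single user input has no meaningful minimum or maximum of its own, so the values learned during training have to be reused. That is the reason scaler.pkl is saved next to titanic.pkl.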
Alright, let’s head over to the views.py file in the app folder.
from django.shortcuts import render

from .predict import getPrediction


# Create your views here.
def home(request):
    return render(request, 'home.html')


def result(request):
    pclass = int(request.GET['pclass'])
    sex = int(request.GET['sex'])
    # age and fare can be fractional in the dataset, so parse them as floats
    age = float(request.GET['age'])
    sibsp = int(request.GET['sibsp'])
    parch = int(request.GET['parch'])
    fare = float(request.GET['fare'])
    embC = int(request.GET['embC'])
    embQ = int(request.GET['embQ'])
    embS = int(request.GET['embS'])

    result = getPrediction(pclass, sex, age, sibsp, parch, fare, embC, embQ, embS)

    return render(request, 'result.html', {'result': result})
The home() function simply renders home.html, which contains the form where our users will input some details. Then the result() function gets those details, makes a prediction, and renders the prediction result.
Notice that we made sure every detail corresponds to the features used in training the model.
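The view assumes every field is present and numeric, so an empty or mistyped field raises an exception. One possible way to guard against that is sketched below. The names FIELDS and parse_passenger are hypothetical and not part of the tutorial's code; the function accepts any dict-like object of strings, which is how request.GET behaves.

```python
# Field names from the form, paired with the type each one is parsed as
FIELDS = [
    ("pclass", int), ("sex", int), ("age", float),
    ("sibsp", int), ("parch", int), ("fare", float),
    ("embC", int), ("embQ", int), ("embS", int),
]


def parse_passenger(params):
    """Convert form values to numbers, in the order the model expects.

    Raises ValueError naming the first missing or non-numeric field.
    """
    features = []
    for name, convert in FIELDS:
        raw = params.get(name)
        if raw is None or raw.strip() == "":
            raise ValueError(f"Missing value for '{name}'")
        try:
            features.append(convert(raw))
        except ValueError:
            raise ValueError(f"'{name}' must be a number, got {raw!r}") from None
    return features


# Made-up example input, shaped like the submitted form data
example = {"pclass": "3", "sex": "0", "age": "22", "sibsp": "1", "parch": "0",
           "fare": "7.25", "embC": "0", "embQ": "0", "embS": "1"}
features = parse_passenger(example)
```

Inside a view, the ValueError could be caught and its message shown to the user instead of an error page.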
The final step is templates. Create a folder named templates. Make sure you create it in the project’s root directory, next to manage.py. Then register it in the settings.py file.
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [os.path.join(BASE_DIR, 'templates')],  # add this
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
Don’t forget to add import os at the top of settings.py, since os.path.join needs it. Inside the templates folder, create a home.html file.
<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Home</title>
  </head>
  <body>
    <h1 style="color:blue">Titanic Survival Prediction</h1>
    <form action="{% url 'result' %}">
      {% csrf_token %}
      <p>Passenger Class:</p>
      <input type="text" name="pclass">
      <br>
      <p>Sex:</p>
      <input type="text" name="sex">
      <br>
      <p>Age:</p>
      <input type="text" name="age">
      <br>
      <p>Sibsp:</p>
      <input type="text" name="sibsp">
      <br>
      <p>Parch:</p>
      <input type="text" name="parch">
      <br>
      <p>Fare:</p>
      <input type="text" name="fare">
      <br>
      <p>Embark Category C:</p>
      <input type="text" name="embC">
      <br>
      <p>Embark Category Q:</p>
      <input type="text" name="embQ">
      <br>
      <p>Embark Category S:</p>
      <input type="text" name="embS">
      <br>
      <input type="submit" value="Predict">
    </form>
  </body>
</html>
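This syntax "{% url 'result' %}" is Django templating language. It resolves to the URL of the route named result. Remember the name argument in the path() function in urls.py? That name is what the tag looks up, so the form is submitted to the result/ URL, whose view renders result.html. The csrf_token is there for security reasons. Django requires it on forms submitted with POST. Our form uses the default GET method, so Django does not check it here.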
Can you now see that it’s from this form that we get those names in the result() function? The data is collected here and sent to the result() function, which processes the data, makes predictions, and displays the result in result.html.
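Because the form has no method attribute, the browser submits it with GET, which means the field values travel in the URL's query string. The standard-library sketch below shows roughly what that URL looks like and what the server can read back from it. The values are made up, and the hidden field that csrf_token adds is left out for brevity.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Made-up values typed into the form
form_values = {"pclass": 3, "sex": 0, "age": 22, "sibsp": 1, "parch": 0,
               "fare": 7.25, "embC": 0, "embQ": 0, "embS": 1}

# Roughly what the browser requests after the form is submitted with GET
url = "/result/?" + urlencode(form_values)

# What the server side can recover from that URL: every value is a string
received = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
```

Every received value is a string, which is why the view converts each one to a number before calling getPrediction().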
We now create the result.html file in the templates folder.
<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Result</title>
  </head>
  <body>
    <h1 style="color:blue">Prediction</h1>
    <p>Verdict: {{ result }}</p>
  </body>
</html>
It’s very simple. The {{ result }} is a variable that will be rendered to this web page. Go back to your result() function to recall if you have forgotten.
That’s it. We are done. Thanks for staying with me so far in this tutorial. Let’s check what we have done. Run the local server.
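The home page displays everything in the form. Enter details and make predictions. Remember, the form accepts only numbers: for example, enter 0 for male and 1 for female, and enter 1 for the passenger’s embarkation category (C, Q, or S) and 0 for the other two. If you are lost, check the dataset using Pandas.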
If you encounter an error saying “No such file or directory: 'titanic.pkl'”, you may have to manually run model.py to generate the file.
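This error can also appear when the file exists but the server was started from a different folder, because open('titanic.pkl') is resolved relative to the current working directory. One possible remedy is to build the path from a known base directory. The sketch below uses a hypothetical helper named load_artifact and a temporary directory standing in for the project folder; in a Django project, BASE_DIR from settings.py could serve as the base, which is an assumption rather than something this tutorial sets up.

```python
import pickle
import tempfile
from pathlib import Path


def load_artifact(name, base_dir):
    """Load a pickle from base_dir, with a clearer error if it is absent."""
    path = Path(base_dir) / name
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; run model.py first to generate it"
        )
    with open(path, "rb") as f:
        return pickle.load(f)


# Demonstration in a temporary directory standing in for the project folder
with tempfile.TemporaryDirectory() as project_dir:
    with open(Path(project_dir) / "titanic.pkl", "wb") as f:
        pickle.dump({"stand_in": "model"}, f)

    loaded = load_artifact("titanic.pkl", project_dir)

    try:
        load_artifact("scaler.pkl", project_dir)
    except FileNotFoundError as exc:
        message = str(exc)
```

The second call fails on purpose to show the more descriptive message.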
Conclusion
Congratulations!! You are not only a data scientist but also a Django developer.
In this tutorial, we trained a Logistic Regression model on the Titanic dataset.
We demonstrated how to serve its predictions using Django.
💻 Exercise: As an assignment, can you use what you have learned to make predictions on the iris dataset, the hello world of data science? Give it a try using Django.
You are welcome. 💪