How I Created Charts Using Highcharts.js in Django Web Application

5/5 - (1 vote)

Highcharts.js is a client-side JavaScript library used for data visualization. It can be included in Django templates just like any other JavaScript library using the link:

<script src="https://code.highcharts.com/highcharts.src.js"></script>

The hard work has already been done for us by Highcharts. Where we need to work on is finding a way to structure or format the data in a way that Highcharts can understand. In this tutorial, we will learn how to dynamically generate charts and have them rendered on a Django web application.

The Model

For this demonstration, we will use the famous Titanic dataset. Here is the model class:

SEX_CHOICES = (
    ('M', 'Male'),
    ('F', 'Female'),
)


PORT_CHOICES = (
    ('C', 'Cherbourg'),
    ('Q', 'Queenstown'),
    ('S', 'Southampton'),
)

class Passenger(models.Model):
    name = models.CharField(max_length=100, blank=True)
    sex = models.CharField(max_length=1, choices=SEX_CHOICES)
    survived = models.BooleanField(default=True)
    age = models.FloatField(null=True)
    p_class = models.CharField(max_length=10)
    fare = models.FloatField(null=True)
    embarked = models.CharField(max_length=1, choices=PORT_CHOICES)

    def __str__(self):
        return self.name

The dataset can be accessed on my GitHub page. Of course, there are more columns than the ones in the Passenger class. But these are the ones we will use.

In previous tutorials, I have worked with the Titanic dataset to create a Machine Learning web application that predicts the survival of passengers based on their attributes. We won’t be performing any data manipulation with the dataset in this tutorial.

We will only query the database to display the data using Highcharts so that we can see how Django interacts with Highcharts.

Populating the Database

Now that we have created the Model class and performed migrations (do it if you have not already done so), we need a way to populate the database with the Titanic dataset.

There are so many ways to do so. We can use the django-csvimport or django-adaptor library which allows us to easily transform a CSV file into a model instance.

Once we install the library, add it to the INSTALLED _APPS section, and run the migration command. The library will match the header line of the CSV file with the fields in the selected model and import any rows that include all required fields.

Another way to populate the database is to use the inbuilt csv module. This is the method we will use. Herein lies the steps we will follow in the Django shell:

  • Open the CSV file using Python’s built-in open() function
  • Create a csv.reader object to read the data from the CSV file
  • Iterate over the rows in the CSV file using a for loop
  • For each row, use the Django’s get_or_create method to create a new record in the database.

Run the code below on your terminal to open the Django shell.

python3 manage.py shell
with open('titanic.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # Skip the header row
    for row in reader:
        # Check if any of the values are empty
        if not all(row[2:5] + row[7:11]):
            continue  # Skip this row
        passenger = Passenger(
            name=row[2],
            sex=row[3],
            survived=bool(int(row[1])),
            age=float(row[4],
            p_class=row[0],
            fare=row[8],
            embarked=row[10]
        )
        passenger.save()

The code reads the data from the file, creates a new Passenger object for each row, and saves it to the database.

We use the all() function to check if all of the values in the columns that correspond to the fields in the Passenger model are not empty. If any of these values are empty, we use the continue statement to skip the current row and move on to the next row. This will skip any rows with missing values in any of these columns.

Having populated the database, let’s check the admin panel to see the results. Be sure to register the model in the admin.py file.

Can you see it? The database has been automatically populated with the Titanic dataset! It’s now time to proceed with the very task that forms the basis of this project.

Querying the database to display charts

As rightly stated, for Highcharts to interact with Django, the data must be formatted in a way it will understand. This can be done by querying the database. Therefore, in this project, we will query the database to:

  1. Display the number of Titanic survivors and death by ticket class.
  2. Display the number of Titanic survivors and death by gender.
  3. Display the number of Titanic survivors and death by age group.

Titanic Survivors by Ticket class

Go to the views.py file and write this code:

from django.shortcuts import render
from .models import Passenger
import json


def survivors_by_ticket_class(request):
    dataset = []
    for ticket_class in Passenger.objects.values_list('p_class', flat=True).distinct():
        survived_count = Passenger.objects.filter(p_class=ticket_class, survived=True).count()
        not_survived_count = Passenger.objects.filter(p_class=ticket_class, survived=False).count()
        dataset.append({
            'ticket_class': ticket_class,
            'survived_count': survived_count,
            'not_survived_count': not_survived_count
        })


    survivors = {
        'name': 'Survived',
        'data': [data['survived_count'] for data in dataset],
        'color': 'green'
    }

    death = {
        'name': 'Not Survived',
        'data': [data['not_survived_count'] for data in dataset],
        'color': 'red'
    }


    chart = {
        'chart': {'type': 'column'},
        'title': {'text': 'Titanic Survivors by Ticket Class'},
        'xAxis': {'categories': [data['ticket_class'] for data in dataset]},
        'series': [survivors, death]
    }

    dump = json.dumps(chart)

    return render(request, 'ticket_class.html', {'chart': dump})

We initialize an empty list and iterate over the distinct value of the p_class field in the Passenger model. Then, we count the number of passengers that survived and those that do not. Finally, we create a dictionary and append it to the dataset list. The dataset now becomes a list of dictionaries.

This is the result stored in the dataset list:

[{'ticket_class': '1', 'survived_count': 156, 'not_survived_count': 80},
 {'ticket_class': '2', 'survived_count': 16, 'not_survived_count': 5},
 {'ticket_class': '3', 'survived_count': 8, 'not_survived_count': 5}]

Remember, we skip some rows of the Titanic dataset. The chart configuration is done at the backend in the view function.

The survivors and death dictionaries represent the two data series that will be plotted on the chart. The name key specifies the name of the series, the data key specifies the data values for the series, and the color key specifies the color of the columns for the series.

The chart dictionary contains the overall configuration for the chart. The 'chart': {'type': 'column'} key-value pair specifies that this is a column chart. The 'title': {'text': 'Titanic Survivors by Ticket Class'} key-value pair specifies the title of the chart.

The 'xAxis': {'categories': [data['ticket_class'] for data in dataset]} key-value pair specifies the categories for the x-axis of the chart, which in this case are the ticket classes. Finally, the 'series': [survivors, death] key-value pair specifies that the survivors and death data series should be plotted on the chart.

The json.dumps() function is used to convert the chart dictionary into a JSON-formatted string. This is necessary because the data needs to be in a format that can be easily passed to the template and rendered in the browser.

Here is the template:

<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <title>Django Highcharts Example</title>
</head>
<body>
  <div id="container"></div>
  <script src="https://code.highcharts.com/highcharts.src.js"></script>
  <script>
    Highcharts.chart('container', {{ chart|safe }});
  </script>
</body>
</html>

The {{ chart|safe }} code is using the safe keyword to mark the chart variable as safe for rendering as HTML. This is necessary because the chart variable contains a JSON-formatted string that needs to be rendered the way it is written for the charting library to correctly parse and use it.

Without the safe keyword, any special characters in the chart variable would be escaped, which would prevent the charting library from correctly parsing the data and creating the chart.

Titanic survivors by gender

from django.db.models import Count, Q
from .models import Passenger

def survivors_by_gender(request):
    dataset = Passenger.objects \
        .values('sex') \
        .annotate(survived_count=Count('sex', filter=Q(survived=True)),
                  not_survived_count=Count('sex', filter=Q(survived=False))) \
        .order_by('sex')

    survivors = {
        'name': 'Survived',
        'data': [data['survived_count'] for data in dataset],
        'color': 'green'
    }

    death = {
        'name': 'Not Survived',
        'data': [data['not_survived_count'] for data in dataset],
        'color': 'red'
    }

    chart = {
        'chart': {'type': 'column'},
        'title': {'text': 'Titanic Survivors by Gender'},
        'xAxis': {'categories': [data['sex'] for data in dataset]},
        'series': [survivors, death]
    }

    dump = json.dumps(chart)

    return render(request, 'gender.html', {'chart': dump})

This function uses the provided code to create a dataset of the number of survivors and deaths by gender. In Django, the annotate() method is used to calculate aggregate values for each item in a queryset. It allows you to add additional fields to the queryset that represent the result of an aggregation, such as a count, sum, average, minimum, or maximum.

`Q` is a class in Django that allows you to create complex lookups by combining multiple query expressions using logical operators such as `&` (and), `|` (or), and `~` (not). It is often used in conjunction with the `filter()` method to create more complex queries.

The rest of the code is similar to the previous function, but with some changes to reflect the fact that the data is now grouped by gender instead of ticket class.

Check my GitHub page for the templates and the app-level URLs. They are basically the same.

Titanic survivors by age

Back to the views.py file.

from django.db.models import  F, ExpressionWrapper, IntegerField
def survivors_by_age(request):
    dataset = Passenger.objects \
        .annotate(age_bin=ExpressionWrapper((F('age')/10)*10, output_field=IntegerField())) \
        .values('age_bin') \
        .annotate(survived_count=Count('age_bin', filter=Q(survived=True)),
                  not_survived_count=Count('age_bin', filter=Q(survived=False))) \
        .order_by('age_bin')

    survivors = {
        'name': 'Survived',
        'data': [data['survived_count'] for data in dataset],
        'color': 'green'
    }

    death = {
        'name': 'Not Survived',
        'data': [data['not_survived_count'] for data in dataset],
        'color': 'red'
    }

    chart = {
        'chart': {'type': 'column'},
        'title': {'text': 'Titanic Survivors by Age Group'},
        'xAxis': {'categories': [f"{int(data['age_bin'])}s" for data in dataset]},
        'series': [survivors, death]
    }

    dump = json.dumps(chart)

    return render(request, 'age.html', {'chart': dump})

The F class is used to reference fields on a model within a query. In the code above, we are referencing the age field. ExpressionWrapper is a class that allows you to wrap a database expression with a specific output field. It is used to perform calculations on a field and use the result in a query.

In the code, ExpressionWrapper((F('age')/10)*10, output_field=IntegerField()) is used to create an expression that calculates the age group for each passenger.

IntegerField is a class that represents an integer field in a database. It is used as the output_field argument for the ExpressionWrapper to specify that the result of the expression should be treated as an integer.

Those three classes are provided by Django ORM to help with constructing complex database queries.

We can go on and on to create complex queries. Check my GitHub page to see more views created to:

Display Titanic survivors by location

Average age by ticket class

Total fare paid by passengers in each ticket class

The minimum and maximum age of passengers in each ticket class

Conclusion

In this tutorial, we have learned how to use the Django ORM to create complex queries that aggregate data from a model. We have also learned how to use this data to create charts using Highcharts.js.

Specifically, we learned how to populate our database with data from a CSV file. We have created several functions that generate charts showing various statistics about the Titanic passengers, such as the number of survivors by ticket class, gender, and port of embarkation, as well as the average age and total fare paid by ticket class.

Overall, this tutorial has demonstrated the power and flexibility of the Django ORM and shown how it can be used to create data-driven visualizations with Highcharts.js.