Python Dash Web App Tutorial


Last Updated on July 14, 2022 by Jay

In this tutorial, we’ll use Python Dash to create an interactive web application that will update graphs based on user input.

We are going to make a simple data visualization app for historical covid case counts in each US state. We’ll go through the following steps:

  1. Make a web app with a simple layout
  2. Get data for the web app
  3. Make the web app interactive

What is Dash?

Dash is a Python library for data visualization on the web. The library is developed and maintained by the same team that created plotly, so you might sometimes hear people call it “plotly dash”. Like its sibling plotly.py, dash is written on top of plotly.js and react.js. This means we can create beautiful data visualization web apps (thanks JS!) with our favorite language, Python.

The three tools from the plotly family each serve a different purpose, and they integrate really well together. We’ll use either plotly or the express version to make a graph, then use dash to create the web server, arrange the layout for display and add interactivity between the users and app.

plotly family libraries:

  plotly.express: Quick charting, exploratory data analysis
  plotly.graph_objects: Full customization and control of the plotly library
  dash: Interactive chart on the web (current tutorial)

Why Web App?

Imagine we created an awesome data visualization in Python and are excited to share it with our friends or colleagues. Do we just take a screenshot and send that to them? Our visualization is fully interactive, so sending a static picture makes no sense. We could send them the Python script, but what if they don’t know how to run Python?

The solution is a web app, which is really just a dynamic website that people can interact with. The advantage of a web app is that we can put the visualization on the Internet, so people can access it from anywhere on almost any device with a browser.

We’ll use pip to get dash and plotly, plus pandas, which we’ll need later to load the data.

pip install dash plotly pandas

Covid Data Visualization

The screenshot below shows what the final product looks like:

Dash web app for historical covid cases

To start, let’s import both plotly and dash.

import dash
from dash import html
from dash import dcc
import plotly.express as px

html is a module for writing HTML elements, and dcc is the “dash core components” module, which provides dynamic content such as interactive graphs, dropdowns, etc.

A dash web application contains two major parts: the app layout, and the interactive features of the application. We’ll first create the layout for this web app, and then in the next part, we’ll show how to add the interactivity between the app and users.

Dash App Layout

First of all, we need a Dash object. Then we define the layout, which describes what the application will look like. This is where HTML knowledge helps, because we are basically creating HTML components using Python. Each element shows up on the page in the same order it appears in the Python code.

Using the html module, we can create HTML components such as divs, headings, tables, etc.

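app = dash.Dash()

# The layout is a tree of components, rendered in the order listed.
app.layout = html.Div([
    html.Div('Hello World From Dash.'),
    html.H1('H1 tag here'),
    html.Div(dcc.Dropdown(id='dropdown',
                          options=[{'label': 'california', 'value': 'california'},
                                   {'label': 'illinois', 'value': 'illinois'},
                                   {'label': 'new york', 'value': 'new york'}])),
])

# Starts the local development server; newer Dash releases provide app.run(debug=True) as the equivalent call.
app.run_server(debug=True)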

The dcc.Dropdown component creates a dropdown box with three values (california, illinois, new york).

Finally, we call the app.run_server method, which creates and starts a Flask web server in the backend.

To start this web app, run the script from your editor (F5 in many IDEs) or, if you are using a virtual environment, type python followed by the script name on the command line.

Either way will work. We should see a message like “Dash is running on http://127.0.0.1:8050/”. The address 127.0.0.1 is also known as localhost, so we can either copy and paste the address into a web browser or replace the IP address with localhost.

Setting debug=True makes the coding process easier: every time we change and save the code, we don’t have to stop and relaunch the app. Just refresh the page and the change should show up in the app.

For now, the app is just a static page. Whenever we need to stop the web app, go to the console and press Ctrl+C. You might need to press it a few times to stop the web server.

Get Data

For this part, we are going to make a chart using plotly to show the daily covid cases. First thing, we need to get the data from the Johns Hopkins University covid GitHub repository. https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv

The source data is too big to display on GitHub. However, if you click on “View Raw”, it will take you to the following URL. https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv

This is essentially a csv file, which means we can use pandas to read the data into Python.

We can use the groupby function to aggregate the data by state. The argument as_index=False tells pandas not to use Province_State as the new index, and sum adds up every date column for a given state.

The resulting table will give us only state-level case counts.

There’s still a problem – all the case counts are still in separate columns and this is going to make plotting difficult.

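import pandas as pd

df = pd.read_csv(r'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')

# numeric_only=True keeps text columns (county names, ISO codes) out of the sum
df_state_lvl = df.groupby('Province_State', as_index=False).sum(numeric_only=True)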
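To see what the groupby step does without downloading the full file, here is a minimal sketch on a made-up frame in the same wide layout. The county names and numbers are invented for illustration. The sketch passes numeric_only=True so that text columns such as the county name are left out of the sum.

```python
import pandas as pd

# Made-up county-level rows in the same wide layout as the JHU file.
toy = pd.DataFrame({
    "Admin2": ["Alameda", "Los Angeles", "Cook"],
    "Province_State": ["California", "California", "Illinois"],
    "1/22/20": [1, 2, 5],
    "1/23/20": [3, 4, 6],
})

# as_index=False keeps Province_State as a regular column;
# numeric_only=True leaves text columns such as Admin2 out of the sum.
toy_state_lvl = toy.groupby("Province_State", as_index=False).sum(numeric_only=True)
print(toy_state_lvl)
```

The result has one row per state, with the county rows collapsed into a single total for each date column.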

We need to put the case counts into a single column, with the corresponding dates in another column. Something like this:

Data format for plotly chart

It’s sort of like unpivoting the header row in the original dataframe, such that we can have all the dates and case counts going down vertically instead of going horizontally.

The pandas melt function does exactly that – we pass in the state level dataframe, keep the province_state column, then unpivot the date columns by feeding all those columns (as a list) into the value_vars argument.

df_melt = df_state_lvl.melt(id_vars=['Province_State'], value_vars=df_state_lvl.columns[(df_state_lvl.columns.str[-2:] == '21') | (df_state_lvl.columns.str[-2:] == '20')])

This line here value_vars=df_state_lvl.columns[(df_state_lvl.columns.str[-2:] == '21') | (df_state_lvl.columns.str[-2:] == '20')] looks at the last two characters of each column name. If they are either ’20’ or ’21’, meaning the year 2020 or 2021, then we know it’s a date column that contains the covid case counts. We want to keep only the date columns and remove everything else.
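The two-character check only picks up columns ending in ’20’ or ’21’. As an alternative (not the approach used in this tutorial), the date columns could be found by trying to parse every column name as a date. This is a rough sketch on a hypothetical list of column names, assuming the dates follow the month/day/two-digit-year pattern seen in the file.

```python
import pandas as pd

# Column names in the style of the JHU file: metadata first, then dates.
columns = pd.Index(["UID", "Province_State", "Lat", "1/22/20", "12/31/21", "1/1/22"])

# Names that do not parse as month/day/year become NaT and are filtered out.
parsed = pd.to_datetime(columns, format="%m/%d/%y", errors="coerce")
date_columns = columns[parsed.notna()]
print(list(date_columns))
```

A selection like date_columns could then be passed to value_vars in place of the year-suffix filter, which would also keep dates from later years.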

The unpivoted table will look like the screenshot above. The variable column contains the date, and the value column contains the cumulative case count reported as of that date.
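Here is a small sketch of the same unpivot on a made-up two-state frame, so the before and after shapes are easy to compare. The numbers are invented, and the extra date column is an optional addition, not something the tutorial code does.

```python
import pandas as pd

# Made-up state-level totals in the wide layout produced by the groupby step.
wide = pd.DataFrame({
    "Province_State": ["California", "Illinois"],
    "1/22/20": [3, 5],
    "1/23/20": [7, 6],
})

# One row per (state, date) pair; melt names the new columns 'variable' and 'value'.
long_df = wide.melt(id_vars=["Province_State"], value_vars=["1/22/20", "1/23/20"])

# Optional: parse the date strings so they sort and plot as real dates.
long_df["date"] = pd.to_datetime(long_df["variable"], format="%m/%d/%y")
print(long_df)
```

Two states and two date columns become four rows, one for each combination.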

We are done preparing the data. Next, let’s plot it.

Make A Plotly Chart

We are going to display just one state for now; otherwise, the chart will be very crowded.

fig = px.line(df_melt.loc[df_melt['Province_State'] == 'California'], x='variable', y = 'value')
fig.show()
California Daily Covid Case Count – Cumulative

The chart looks good. Next, let’s add all the state names to the dropdown in the dash app layout. We can get the state names with the unique method, then create a list of dictionaries with label and value pairs, and pass that list into the dcc.Dropdown component.

ops = df_melt['Province_State'].unique()
labels = [{'label':i, 'value':i} for i in ops]
fig = px.line(df_melt.loc[df_melt['Province_State'] == 'California'], x='variable', y = 'value')

app.layout = html.Div([html.Div('Hello world from dash updated.'),
                       html.H1('H1 tag here'),
                       html.Div(dcc.Dropdown(id='dropdown', options = labels)),
                       dcc.Graph(id='fig1', figure=fig)])
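The options list is plain Python data, so it can be built and inspected without starting the app. This is a minimal sketch with a hypothetical list of three state names standing in for the result of unique. Preselecting a default value is an assumption added here, not part of the layout above.

```python
# Hypothetical stand-in for df_melt['Province_State'].unique()
state_names = ["California", "Illinois", "New York"]

# Each dropdown option is a dict with a display label and an underlying value.
options = [{"label": name, "value": name} for name in state_names]

# Assumption: preselect the first state so the dropdown is not empty on page load.
default_value = state_names[0]
print(options[0], default_value)
```

These could be passed to the component as dcc.Dropdown(id='dropdown', options=options, value=default_value).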

Link The Dropdown Values With The Graph

As of now, there’s no connection between the dcc.Dropdown and the graph, but our goal is to link them together so when we select a state in the dropdown, the graph will also update accordingly.

This magic is done using dash callback functions, which dash calls automatically whenever a user changes an input, in order to update some property of another component. It means that users can interact with the graph by adjusting the dropdown value.

The way to implement callback functions in dash is simple. We need to:

  1. Write a function to update a component (graph, text, etc)
  2. Use the app.callback decorator to connect the function in 1) with the component we want to update

First, we need to import two more objects, Input and Output. Here we import them from dash.dependencies; in Dash 2.0 and later, from dash import Input, Output works as well.

from dash.dependencies import Input, Output

The Input object refers to the thing that a user is going to change, in this case the state name from the dropdown box.

The Output object refers to the thing that should be updated, which is the graph.

@app.callback(Output('fig1', 'figure'),
              Input('dropdown', 'value'))
def update_graph(state):
    df_state = df_melt.loc[df_melt['Province_State'] == state]
    fig = px.line(df_state, x = 'variable', y ='value', title = f'{state} cumulative case counts')
    return fig

For the decorator @app.callback():

  1. The first argument is the Output. Inside Output, the first argument is the ID of the component to update, which is the graph we named fig1 earlier. The second argument is the property of that component to update, which is figure.
  2. The second argument is the Input. As with Output, the first argument of Input is the ID of the component, which is dropdown, and the second is the property to read, which is value.

Immediately following the decorator, we write a function to actually do the update. The function name doesn’t really matter. I’m going to call it update_graph, but you can call it anything you want.

It must take arguments, and the number of arguments depends on the number of Input objects in the callback decorator. In our case, we have only one Input, so the function needs just one argument. The name of the argument also doesn’t matter, but I’m going to call it state, because that’s what the data is.

Basically, this function will filter data based on the given state, then regenerate a new figure using data for just that state.

We can also add a title that will also update the state name as we update the dropdown box.

At the end of the function, we must return some data, which is a plotly figure object.

How does the callback mechanism work?

plotly dash callback mechanism
  1. When we select illinois from the dropdown, that value gets passed into the callback decorator as the Input, and from there into the state argument of the callback function.
  2. The callback function update_graph does its thing to update stuff. We can write a callback function to update pretty much anything on the web page; it just happens to be the chart in our example.
  3. Once the callback function completes the update, it returns the data (a figure object) to the Output in the callback decorator, which then updates the graph 'fig1' on the web page.

That completes the full cycle of the interactive feature of dash.

Now you know how to make interesting data visualizations, what will you create next?

Putting It All Together

import dash
from dash import html,dcc
import pandas as pd
import plotly.express as px
from dash.dependencies import Input, Output


df = pd.read_csv(r'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')

# numeric_only=True keeps text columns (county names, ISO codes) out of the sum
df_state_lvl = df.groupby('Province_State', as_index=False).sum(numeric_only=True)

df_melt = df_state_lvl.melt(id_vars=['Province_State'], value_vars=df_state_lvl.columns[(df_state_lvl.columns.str[-2:] == '21') | (df_state_lvl.columns.str[-2:] == '20')])

ops = df_melt['Province_State'].unique()
labels = [{'label':i, 'value':i} for i in ops]
fig = px.line(df_melt.loc[df_melt['Province_State'] == 'California'], x='variable', y = 'value')


app = dash.Dash()
app.layout = html.Div([html.Div('Hello world from dash updated.'),
                       html.H1('H1 tag here'),
                       html.Div(dcc.Dropdown(id='dropdown', options = labels)),
                       dcc.Graph(id='fig1', figure=fig)])

@app.callback(Output('fig1', 'figure'),
              Input('dropdown', 'value'))
def update_graph(state):
    df_state = df_melt.loc[df_melt['Province_State'] == state]
    fig = px.line(df_state, x = 'variable', y ='value', title = f'{state} cumulative case counts')
    return fig

# Newer Dash releases provide app.run(debug=True) as the equivalent call.
app.run_server(debug=True)
