Data Dashboard in Practice

Motivation

Approach dashboard building in a structured way, so that building new dashboards is just following a recipe and it can thus scale up well.

Unpivot Structure

At its base its the idea of unpivoting a several columns table into a 3 fixed columns table for any metric.

The Idea

Transform:

date users clicks money ranking
1 50 100 12EUR 1

into:

date metric metric_value
1 users 50
1 clicks 100
1 money 12EUR
1 ranking 1

This approach forces a fixed column structure and thus it becomes consistent to manipulate and easy to add new metrics.

To add interaction use for example Excel slicer on the metric, and plot the metric_value by date.

We end up with a report interface that shows many views in a compact way. For example 1 chart and 1 table + slicers to switch between dimensions, can show a lot of data in a small space.

Also this approach makes it easier to add dimensions into a report without having to change the report format considerably.

In reality we might want to add more columns as extra useful dimensions, like country, type, etc…

But once you keep using this structure as a base, there’s whole lot of methods developed for one report that you can then reuse for others.

De-duplication Problem

When including several breakdowns like country, be aware that some metrics can’t be summed up to get total, for example users in country. Imagine that same user has been in 2 countries in the same month, that in total sum would show up 2 times, inflating the number artificially.

To avoid the de-duplication problem, calculate the de-duped number upfront (in SQL) and include it in the data extract before going into dashboards.

Example:

Imagine in same month I’ve traveled between Estonia and Portugal, then data extract should look like:

date metric metric_value country
month1 users 1 Portugal
month1 users 1 Estonia
month1 users 1 All

Data Querying (SQL)

The Date Dimension

Pull in an external table with all the date calculation needed and join on the day for example (or some attribute that makes sense). No need to keep rebuilding data functions for each data set, just need to make sure can join the data on some attribute.

For example a calculated date dimension: http://www.ipcdesigns.com/dim_date/

Doing the Unpivoting

In Excel: PowerQuery -> Unpivot

In Python: pandas.melt

MoM calculations

“Month on Month Growth” in power pivot DAX language:

Last Month:=CALCULATE(SUM(‘MYDATA’[metric_value]))

Prev. Month:=CALCULATE(SUM('‘MYDATA’[metric_value]),DATEADD('‘MYDATA’[Month],-1,MONTH))

MoM Change:=IF([Prev. Month],(CALCULATE(SUM('‘MYDATA’[metric_value]))-[Prev. Month])/[Prev. Month],BLANK())

Assorted Tips


comments powered by Disqus