Vividdata Visualization

Visualizing Complexity in Hockey Data

At Vividdata Visualization we built the Hockey Complexity Visualization as a tool for discussing how to handle complexity. It is intentionally busy-looking.

It is an interactive data visualization built with D3.js, a JavaScript library that quickly and effectively links data to graphic elements in a web browser.

This visualization shows in detail how all NHL teams progressed through the 2019-20 season, including the qualifying and playoff rounds played after the season resumed in August of 2020.

Click for interactive D3 visualization.

Here is a way to understand what’s going on in this viz. To analyze any visualization, you must understand at least these four basic things.

First, identify the number of dimensions and the specific data involved. This is an eight-dimensional visualization because it brings together eight separate variables into one complex graphic.

  1. Team cumulative points, which morphs into the team’s position after August. This complex variable is treated as a single dimension in this visualization.
  2. League conference is another variable used to contextualize the teams’ positions during the playoffs. (Division is another variable that is revealed later when interacting with the buttons.)
  3. Game outcome is the variable representing the most granular measured event. Each hockey game, along with its details, is what we’d call the “unit of observation”, while the “unit of analysis” is the game outcome for each team, so each game has two related outcomes. The next three variables are attributes of a game outcome:
  4. Game location, or where the game is played for that team: home or away.
  5. Game result, whether it was a win, loss or an overtime loss for the team.
  6. Game point spread, the difference between the final scores, how much the team won (or lost) by.
  7. Game date is a main variable that is shown along the X axis.
  8. Team is another main variable that is reflected in the individual lines.
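The distinction in item 3 between the unit of observation (a game) and the unit of analysis (a team’s game outcome) can be made concrete with a small sketch. This is not the code behind the visualization. The class names, field names and team codes below are hypothetical, conference and division are left out for brevity, and the points table assumes the regular-season NHL convention of 2 points for a win, 1 for an overtime or shootout loss and 0 for a regulation loss.

```python
from dataclasses import dataclass
from datetime import date
from itertools import accumulate

# Assumed regular-season standings points per result.
POINTS = {"win": 2, "ot_loss": 1, "loss": 0}


@dataclass(frozen=True)
class Game:
    """Unit of observation: one game (hypothetical field names)."""
    game_date: date
    home_team: str
    away_team: str
    home_goals: int
    away_goals: int
    overtime: bool  # True if decided in overtime or a shootout


@dataclass(frozen=True)
class GameOutcome:
    """Unit of analysis: one team's view of one game."""
    team: str
    game_date: date
    location: str  # "home" or "away"
    result: str    # "win", "loss" or "ot_loss"
    spread: int    # goals for minus goals against


def outcomes(game):
    """Split one game into its two related team outcomes (home first)."""
    rows = []
    for team, location, goals_for, goals_against in (
        (game.home_team, "home", game.home_goals, game.away_goals),
        (game.away_team, "away", game.away_goals, game.home_goals),
    ):
        if goals_for > goals_against:
            result = "win"
        else:
            result = "ot_loss" if game.overtime else "loss"
        rows.append(
            GameOutcome(team, game.game_date, location, result,
                        goals_for - goals_against)
        )
    return rows


def cumulative_points(team_outcomes):
    """Running points total for one team's outcomes, in date order."""
    ordered = sorted(team_outcomes, key=lambda o: o.game_date)
    return list(accumulate(POINTS[o.result] for o in ordered))
```

Each Game yields exactly two GameOutcome rows, one per team, and a team’s line in a chart like this one would be drawn from the running total that cumulative_points returns for that team’s rows.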

Second, you have to understand how each data variable has been measured. Too many people in professional data science and data visualization practice don’t fully understand measurement levels or why they are important. In this hockey example:

  • one data variable is ratio-level and is transformed into ordinal data
  • two others are interval
  • another is ordinal
  • two are categorical, and 
  • two are binary

See if you can identify which variable is of which type. For any visualization, you are doing very well if you can identify which variable is at which level and explain why that level is appropriate and valid.
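One way to see why measurement levels matter is to list which summary operations are meaningful at each level. The sketch below follows the standard nominal, ordinal, interval and ratio typology, treating binary data as a categorical variable with two categories. The function names and operation labels are made up for illustration, and the sketch deliberately does not say which hockey variable sits at which level.

```python
# Levels ordered from least to most structure. Each level keeps every
# operation that is meaningful at the levels before it.
LEVELS = ["categorical", "ordinal", "interval", "ratio"]

# Operations that first become meaningful at each level.
NEW_OPERATIONS = {
    "categorical": {"count", "mode"},
    "ordinal": {"rank", "median"},
    "interval": {"difference", "mean"},
    "ratio": {"ratio"},
}


def permitted(level):
    """Return every operation that is meaningful at the given level."""
    index = LEVELS.index(level)
    operations = set()
    for name in LEVELS[: index + 1]:
        operations |= NEW_OPERATIONS[name]
    return operations


def is_meaningful(level, operation):
    """True if the operation makes sense for data at this level."""
    return operation in permitted(level)
```

For example, is_meaningful("ordinal", "mean") is False: ranks can be ordered, but the gaps between them are not guaranteed to be equal, so averaging them can mislead.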

Third, you need to identify the analytical structure, which is implicit in every visualization. This visualization is basically a three-dimensional structure (X axis, Y axis and lines) with additional dimensions of data integrated in different ways. This approach is valid given certain data measurement levels and types of data.

Fourth, and most importantly, for each variable you must understand if and to what extent the visualization is focusing your attention on specific fundamental qualities of quantitative data: its magnitude, central tendency or variability.

Over the decades, business graphics have been highly focused on central tendency, e.g., averages, rates, ratios, percentages and trends. These are all generalizations and information “reductions” that reflect only a portion of the full information content of the data.
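A small worked example shows how much a measure of central tendency can leave out. The two series of game point spreads below are invented for illustration (they are not real NHL results), and the function name is hypothetical. Both series have the same mean, yet they differ sharply in magnitude and variability.

```python
from statistics import mean, pstdev

# Hypothetical point spreads for two made-up teams over six games.
team_a = [1, -1, 1, -1, 1, 1]
team_b = [5, -4, 3, -6, 2, 2]


def summarize(spreads):
    """Summarize one series by central tendency, magnitude and variability."""
    return {
        "mean": mean(spreads),                                # central tendency
        "largest_magnitude": max(abs(s) for s in spreads),    # magnitude
        "std_dev": pstdev(spreads),                           # variability
    }


summary_a = summarize(team_a)
summary_b = summarize(team_b)
```

A chart of averages alone would draw these two teams identically. A view that plots every game would show one team playing consistently close games and the other swinging between lopsided wins and losses.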

Current visualization technologies allow greater visual detail, so we can instead (and sometimes also) show magnitudes and variabilities in data. Those require a more complex view, which the technologies now enable.

The Hockey Complexity visualization may look out of the ordinary because it shows the game outcomes of the regular season and postseason in high detail, which expresses the data’s magnitude and variability. But by design it reveals relatively little (except for team trends) in the way of centrality.

If you can look at a visualization like this and identify its data, dimensionality, levels and structure, and then state that it highlights magnitude and variability but is minimal on expressing centrality…

At that point you understand how to approach this and any other complex data visualization.

Check out the interactive Hockey Complexity data visualization to explore how it works. 

Note: The National Hockey League, the NHL name, its logos and team names are the trade property of the League and its respective member teams.