Interactive Correlation Matrix with D3


Recently I decided that if I wanted to improve my data visualization skills, I should become more familiar with D3.js. D3 is a JavaScript library that lets you manipulate documents by binding their elements to data that you provide. Unlike Highcharts or Plotly, it isn't exactly a visualization library, and it doesn't have built-in chart types. Instead, it lets you connect existing web standards like HTML, SVG, and CSS directly to your data. This makes D3 extremely powerful and flexible, but also difficult to learn.

Hearing about the steep learning curve scared me away from D3 at first, but ultimately I'm giving it a shot. After all, I'm not a total newcomer to JavaScript, and I like to think I can pick these types of things up pretty quickly, so maybe it won't be so bad.

Here is my first attempt at making something useful: a correlation matrix where clicking on a cell draws the scatterplot of the two variables that cell represents. I borrowed heavily from Karl Broman's example, which I found very helpful as a reference, since I'm not yet comfortable enough to make anything moderately involved on my own.

I got the data from a recent Kaggle competition on housing prices. I did a small amount of cleaning and removed all the categorical variables for the purposes of this illustration.
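For anyone curious how the numbers behind the matrix might be produced, here is a minimal sketch of a Pearson correlation matrix in plain JavaScript. It is not necessarily how my code does it; the function names `pearson` and `correlationMatrix` and the toy columns are made up for illustration, and the values are not from the Kaggle data.

```javascript
// Pearson correlation between two equal-length numeric arrays.
function pearson(x, y) {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy; // co-deviation of x and y
    sxx += dx * dx; // squared deviation of x
    syy += dy * dy; // squared deviation of y
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Build the full matrix from an object of named numeric columns.
// Row/column order follows Object.keys(columns).
function correlationMatrix(columns) {
  const names = Object.keys(columns);
  return names.map(a => names.map(b => pearson(columns[a], columns[b])));
}

// Hypothetical toy columns (assumed values, for illustration only).
const toy = {
  area: [1, 2, 3, 4],
  price: [2, 4, 6, 8],
  age: [4, 3, 2, 1]
};
const matrix = correlationMatrix(toy);
```

Here `matrix[i][j]` holds the coefficient for the variables at positions `i` and `j`. This only makes sense for numeric columns, which is why dropping the categorical variables keeps things simple.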

The size of each circle represents the strength of the correlation between a pair of variables. Blue indicates a positive correlation and red a negative one. Hover over a cell to see the two variables and their correlation coefficient. Click on a cell to draw the scatterplot.
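The encoding described above can be sketched as a small helper that turns one correlation coefficient into circle attributes. This is only an illustration under my own assumptions: the name `encodeCell`, the linear radius scaling, and the specific color names are hypothetical and may differ from what the actual code uses.

```javascript
// Hypothetical helper: map a correlation r in [-1, 1] to the visual
// attributes of one cell. Linear radius scaling is an assumption.
function encodeCell(r, maxRadius, xName, yName) {
  return {
    // Stronger correlation (larger |r|) gives a bigger circle.
    radius: maxRadius * Math.abs(r),
    // Sign picks the color: blue for positive, red for negative.
    fill: r >= 0 ? "steelblue" : "firebrick",
    // Text for the hover tooltip: both variables and the coefficient.
    label: `${xName} vs ${yName}: r = ${r.toFixed(2)}`
  };
}

// Example with made-up variable names and a made-up coefficient.
const cell = encodeCell(-0.5, 10, "a", "b");
```

In a D3 version, values like these would typically be passed to `.attr("r", ...)` and `.attr("fill", ...)` on the circles bound to each cell. Scaling the radius by the square root of the absolute correlation instead would make the circle's area, rather than its radius, proportional to the strength.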

You can see the code here. It's also hosted on GitHub.