One of the challenges faced by contemporary organizations is making sense of large amounts of complex data. Visualization methods – plots, graphs, animations, and interactive displays – can be one of the best ways to make data understandable.
Consider one example: The ABA recently released its employment data for spring 2014. Those data report employment numbers for graduates of nearly 200 U.S. law schools, using a host of complex categories (descriptions of which can be found here). Visualization is key to understanding the broader picture that the data paint.
One useful tool for this purpose is a heatmap, which is widely used in genomics and other areas where large, complex data are common. It is essentially a color-coded representation of the numeric values in the data, plotted for each observation (school) and variable (employment outcome), and ordered according to some other informative characteristic. For the heatmap here, we looked at the schools in the top 50 of the U.S. News 2014 rankings, and plotted the percentages of each school’s graduates in each of the ABA’s summary outcome categories. Higher values are indicated by blue, and lower values by orange or red (with grey in the middle). We also included a “dendrogram” at the top; this is a visual representation of how similar the categories are to one another, based on the distributions of their values across the different schools.
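To make the color coding concrete, here is a minimal Python sketch of how a single percentage could be turned into a color on an orange-grey-blue scale. It is not the code behind the plot: the anchor colors, the helper names (lerp, value_to_color, heatmap_colors), the choice to scale all cells against one shared minimum and maximum, and the sample percentages are all assumptions made for illustration.

```python
# Hypothetical anchor colors (RGB, 0-255); the plot's actual palette is not known.
LOW = (230, 97, 1)      # orange/red for low values
MID = (190, 190, 190)   # grey for mid-range values
HIGH = (33, 102, 172)   # blue for high values


def lerp(a, b, t):
    """Linearly interpolate between two RGB triples, with t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def value_to_color(value, vmin, vmax):
    """Map a value in [vmin, vmax] onto the orange-grey-blue scale."""
    if vmax == vmin:
        return MID
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    if t < 0.5:
        return lerp(LOW, MID, t / 0.5)
    return lerp(MID, HIGH, (t - 0.5) / 0.5)


def heatmap_colors(rows):
    """Color every cell against one shared min/max (an assumption here)."""
    flat = [v for row in rows for v in row]
    vmin, vmax = min(flat), max(flat)
    return [[value_to_color(v, vmin, vmax) for v in row] for row in rows]


# Made-up percentages: one row per school, one column per outcome category.
sample = [
    [90.0, 3.0, 2.0],
    [70.0, 12.0, 9.0],
    [60.0, 15.0, 13.0],
]
colors = heatmap_colors(sample)
```

Because the scale is stretched between the smallest and largest values present, the same percentage can receive a different color when the set of schools changes.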
The key to using a heatmap is not to focus on individual cells, but instead to look at broad patterns. For example, the “Bar Passage Required” category shows a large block of blue (high values) among the 15-20 highest-ranked schools; below that, the numbers are much more variable. The “JD Advantage” category is almost the opposite, with many low values (reds) among the highest-ranked schools, and higher values (more blue) for schools in the 25-50 range. The same is true for the “Unemployed – Seeking Work” and “Non-Professional” columns, which generally increase in numbers (turn from red to blue) as we move down the rankings, albeit with some variation. Other categories show no clear trend with rank, but include some interesting outliers.
The dendrogram at the top illustrates how similar the columns are to one another. It is built by “linking” the two most similar columns first, then joining the next most similar column or group, and so on (for the intuition, two identically-colored columns would be “linked” first, and two opposite-colored columns last). The main takeaway here is that “Bar Passage Required” positions are very unlike all the rest: Schools with a high percentage of graduates in that category have low percentages in all the others, and vice-versa. In contrast, most of the other outcomes are highly similar to one another, with the “JD Advantage” and “Unemployed – Seeking Work” outcomes the next most distinctive.
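The linking procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used for the plot: it measures similarity with Euclidean distance and joins groups by complete linkage (the distance between two groups is that of their most dissimilar members), and the function names and percentages are invented for the example. The toy numbers are chosen only to show the mechanics and do not reproduce the real dendrogram.

```python
from itertools import combinations
from math import dist


def cluster_distance(a, b, columns):
    """Complete linkage: the largest distance between any pair of members."""
    return max(dist(columns[x], columns[y]) for x in a for y in b)


def merge_order(columns):
    """Return the sequence of merges, most similar pair first."""
    clusters = [(name,) for name in columns]
    merges = []
    while len(clusters) > 1:
        # Find the two current clusters that are closest to each other.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], columns),
        )
        merges.append((a, b))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Made-up percentages for four schools, listed in rank order.
columns = {
    "Bar Passage Required": [90, 85, 70, 60],
    "JD Advantage": [3, 5, 12, 15],
    "Unemployed – Seeking Work": [2, 4, 9, 13],
    "Non-Professional": [1, 1, 3, 4],
}

merges = merge_order(columns)
for left, right in merges:
    print(" + ".join(left), "<->", " + ".join(right))
```

In this toy data the two columns with nearly identical values are joined first, and the column that moves in the opposite direction from all the others is joined last, which is the pattern the dendrogram draws as the tallest branch.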
A PDF of the same heatmap for all 195 schools is here; note that the color mappings are not identical across the two plots.