Six Colors

by Jason Snell & Dan Moren


By Jason Snell

Fun With Charts: A 2021 Report Card breakdown

The Six Colors Report Card for 2021 is in the books, but nerds being nerds, there’s always a clamor for more statistical slicing and dicing of the data.

This year I’m happy to present a few charts from Six Colors member, Duke University professor, and data-visualization expert Kieran Healy that take the initial Report Card scores and slice them in a few interesting ways. (The last one might break your brain. You’ve been warned.)

First up is a chart that drills down into the vote distributions across all the categories, so you can see which categories gathered a variety of votes and which ones were a bit more consistent across all 53 voters.

Answer Distribution for Each Question, Chart

Next is a chart comparing those distributions to the overall distribution (shown in gray). This one lets you compare how people think Apple is doing on each particular topic with their sense of how Apple is doing overall.

Answer Distribution for Each Question, Chart 2

Here’s a plot of all the participants in the survey, ranked from most positive to least positive. I’ve removed the names from this because I didn’t clear such detailed analysis with the panelists in advance, so you’ll never know who our sunniest panelists were (the fifteenth and first to submit), nor the identities of our grouches. (I may need to send flowers to twenty-eight.)

Answer Distribution for Each Respondent, Chart

Finally, here’s the wackiest of all the plots. I’ll let Kieran explain it:

This is a Principal Components Analysis. The idea here is that you have a bunch of respondents and a bunch of dimensions (the 14 questions) that they are giving you information about. You can’t really visualize things in 14 dimensions. So you look for a way to reduce the dimensionality, to pack the data into a smaller number of dimensions while losing as little as possible of the information it contains, by some metric or other. The magic of linear algebra provides various methods for doing this. Plus, you can sort of think of the rows (the respondents) and the columns (the questions) as being dual to one another. If you sort of compactify the 14-D space of answers to questions down to two dimensions, you can show where your respondents are in that space, and you can show where the original questions are too. It’s all about eigenvectors, maaaan!

You’ve got the questions (represented by arrows) and the respondents (represented by their numbers). And the plot’s dimensions are the first two eigenvectors of this decomposition—two orthogonal dimensions that account for the most variance. For the arrows, the fact that they’re all pointing to the left to varying degrees means that there are no questions that really sharply divide respondents. Like, you have no question where the distribution of answers looks truly like the opposite of the distribution of answers to the “How’s the Mac doing” question. It’s more about degrees of agreement.

So the first dimension, the x-axis, is mostly capturing “How high a score got awarded”, with high scores to the left and low scores to the right. You can see this with respondent 28 again, way, way, way off in the bottom right-hand corner, with the most negative pattern of responses. And happy respondents 15 and 38, very positive, are over on the left. Meanwhile, the y-axis is capturing the way the answers to the questions fan out.

The Mac question is headed straight west, the one there’s most consensus on. Then (maybe—PCAs are very inductive) there’s kind of “Questions that are hard to answer, because there’s some uncertainty about them, or disagreement in the respondent pool”, to the northwest, like Services and Environmental/Society. And “Questions where many people have somewhat negative opinions,” like HomeKit, and Apple TV.

That’s kind of speculative. But that’s what the PCA is trying to do—inviting you to interpret the main dimensions it finds in some sort of substantively meaningful way. They’re fun for that. And plus you get this sense of where the individual respondents fall in this imaginary space.

Finally, to the degree that arrows are sitting very close or right on top of one another, that means they are on the same path in that vector space, i.e. the answers to them mostly contain the same sort of information. So it’s not surprising that e.g. the wearables vector and the watch vector are right on top of one another, because in effect—in the minds of your respondents—that’s the same question.

Answer Distribution for Each Respondent, Chart
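
For anyone who wants to poke at the idea themselves, here’s a minimal sketch of the kind of decomposition Kieran describes. To be clear about the assumptions: this is not Kieran’s code, and it does not use the real Report Card data. The scores are randomly generated stand-ins for 53 respondents and 14 questions, the 1-to-5 scale is assumed, the helper names are made up for illustration, and questions 0 and 1 are deliberately built to be near-duplicates (think watch and wearables). It uses NumPy’s singular value decomposition, which is one common way to compute a PCA.

```python
import numpy as np


def pca_biplot_coords(scores, n_components=2):
    """Hypothetical helper: PCA via SVD on a respondents-by-questions matrix.

    Returns respondent coordinates (the numbered points), question
    directions (the arrows), and the share of variance per component.
    """
    # Center each question so the axes describe variation, not the average.
    centered = scores - scores.mean(axis=0)
    # Rows of vt are orthogonal directions, sorted by variance explained.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    respondent_coords = u[:, :n_components] * s[:n_components]
    question_arrows = vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return respondent_coords, question_arrows, explained[:n_components]


def cosine_similarity(a, b):
    """Close to 1.0 when two arrows point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Made-up data: each respondent has an overall "positivity" plus noise.
rng = np.random.default_rng(0)
n_respondents, n_questions = 53, 14
positivity = rng.normal(size=(n_respondents, 1))
noise = rng.normal(scale=0.7, size=(n_respondents, n_questions))
# Make question 1 nearly the same as question 0 (assumption for the demo).
noise[:, 1] = noise[:, 0] + rng.normal(scale=0.1, size=n_respondents)
scores = np.clip(np.round(3 + positivity + noise), 1, 5)

coords, arrows, explained = pca_biplot_coords(scores)

# How strongly does the first axis track each respondent's average score?
mean_score_corr = np.corrcoef(coords[:, 0], scores.mean(axis=1))[0, 1]
# How closely do the arrows for the two near-duplicate questions line up?
overlap = cosine_similarity(arrows[0], arrows[1])

print("variance explained by the two axes:", explained)
print("correlation of axis 1 with mean score:", mean_score_corr)
print("cosine similarity, question 0 vs 1:", overlap)
```

Two things worth noticing if you run it. The sign of each axis is arbitrary, so whether high scores land on the left or the right is an accident of the decomposition rather than something meaningful. And the two near-duplicate questions should produce arrows that sit almost on top of one another, which is the same effect Kieran points out for the watch and wearables questions.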

I hope that last chart blew your mind. I’m still finding pieces of mine scattered around the floor in my office.


