
A second look at the 2024 Six Colors Report Card

By Kieran Healy

I’m a Six Colors subscriber and someone who likes to draw pictures of data, and Jason Snell kindly asked me if I wanted to have a crack at drawing some additional graphs based on the 2024 Report Card.

The figures Jason makes of his Report Card data are already very good, so there is no point in remaking them just for the sake of it. Instead, I’ll show a few additional views that try to bring out some of the structure in the data.

What kind of structure might we be interested in? We can think of three broad areas: the ordering and distribution of the answers; patterns of non-response to the questions; and the relational structure of answers by question and respondent. Let’s look at each one in turn.

Ordering and Distribution

Jason presents a view of the average scores for each of the questions, ordering the questions starting with the Mac and the iPhone on down to Developer relations and World impact. When presenting averages by category like this, it’s often useful to order the categories by the average. We can also get a quick sense of how things compare by scaling and centering the scores before plotting them. We calculate the grand mean for all questions together and subtract that number from each respondent’s answer. Then we calculate the standard deviation of the centered scores and divide each score by that number. Finally, we calculate the average scaled score for each question and plot those averages from highest to lowest. Now we can see how far the average for each question sits from the overall mean, in standard deviations.
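Concretely, here’s a minimal sketch of that calculation in Python with pandas, assuming a hypothetical file named report_card_2024.csv with one row per respondent and one column per question:

```python
import pandas as pd

# Hypothetical file: one row per respondent, one column per question,
# every column a 1-5 grade, blank cells where a respondent skipped.
scores = pd.read_csv("report_card_2024.csv")

# Grand mean over all answers, pooling every question together.
grand_mean = scores.stack().mean()

# Center each answer on the grand mean, then scale by the standard
# deviation of the centered scores.
centered = scores - grand_mean
scaled = centered / centered.stack().std()

# Average scaled score per question, ordered from highest to lowest.
scaled_means = scaled.mean().sort_values(ascending=False)
print(scaled_means.round(2))
```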

[Figure: Scaled Means by Category]

The Mac and Hardware categories do substantially better than average; Vision Pro and Developer Relations substantially worse. If we hadn’t bothered to center and scale the data, but just plotted the means and ordered them, we’d see much the same pattern.

Scaling, centering, and calculating means necessarily summarizes the data. The Report Card survey is just small enough, though, that we should be able to get a good sense of the entire dataset. On the answers side, we can draw a bar chart of the distribution of answers to each question and organize the charts into a faceted or “small-multiple” plot. This is a powerful way to see the overall distribution of answers directly.

[Figure: Distribution of Answers to Questions]

We also order the facets from highest average score (top left) to lowest (bottom right). This lets us see that lower-scoring categories tend to show less consensus amongst respondents than higher-scoring ones. While Hardware Reliability gets almost uniformly 4s and 5s, Developer Relations does not get a parallel wall of 1s and 2s. Rather, its answers are a bit more spread out.
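Here’s one way such a small-multiple chart might be built with pandas and Matplotlib, again using the hypothetical file from above and assuming a dozen question columns (adjust the grid to taste):

```python
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.read_csv("report_card_2024.csv")  # hypothetical file

# Facets ordered from highest average score to lowest.
order = scores.mean().sort_values(ascending=False).index

fig, axes = plt.subplots(3, 4, figsize=(11, 7), sharex=True, sharey=True)
for ax, question in zip(axes.flat, order):
    # Distribution of 1-5 answers for this question.
    counts = scores[question].value_counts().reindex(range(1, 6), fill_value=0)
    ax.bar(counts.index, counts.values)
    ax.set_title(question, fontsize=9)

# Blank out any unused panels.
for ax in axes.flat[len(order):]:
    ax.set_visible(False)

fig.tight_layout()
plt.show()
```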

Patterns of Non-Response

When thinking about why different questions might show different scoring patterns, we naturally ask what it is about the category or topic that would produce a different pattern of answers. The most basic reason is easy to overlook: respondents are not required to answer every question. Some people might choose not to answer a question because they have no opinion, or because the topic is irrelevant to them. So even in a small survey like this we can in effect have different sub-populations of respondents. Let’s look at which questions are most likely to go unanswered.

[Figure: Non-Response Patterns]

Everyone has an opinion on Hardware, OS Quality, and the iPhone. But a fifth of respondents have no opinion on the Vision Pro. Almost the same number skipped the TV, home, and societal-impact questions. Twenty-two of the fifty-nine respondents have no views on Developer Relations.
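The tallies themselves take only a couple of lines of pandas, assuming skipped questions are stored as missing values in the same hypothetical file:

```python
import pandas as pd

scores = pd.read_csv("report_card_2024.csv")  # hypothetical file

# Number and share of respondents who skipped each question,
# most-skipped first.
skipped = scores.isna().sum().sort_values(ascending=False)
print(pd.DataFrame({"n": skipped, "share": (skipped / len(scores)).round(2)}))
```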

Relational Patterns

A survey like this is a table of rows and columns. The rows are the respondents and the columns are the questions. Each cell is a particular respondent’s score on a particular question (which may be missing). We can do a surprising amount with data of this sort. In this case, we can shuffle both the rows and the columns around in a systematic way until the respondents and the questions are each arranged to be as similar to their neighbors as we can reasonably make them. There are many, many approaches to clustering data in this way. One of my favorites for data of this size is to make a “Bertin Plot” of the responses. Named after the French geographer Jacques Bertin, who developed the technique in the 1960s, plots like this involve permuting or “seriating” the rows and columns. Bertin’s group originally did this by hand, using a matrix of Lego-like blocks that could be skewered along the rows and columns.

[Figure: Bertin's Physical Matrix]

There is a highly entertaining, and very French, video from the early 1970s that motivates and demonstrates the method. Because we have the Internet on computers now, we can pick a method for permuting our table and apply it directly. While much faster, the computerized version sadly involves no skewers. Here is what we get when we apply the method to the Report Card data.
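Bertin’s group worked by eye and hand; as a rough computational stand-in, we can let hierarchical clustering from SciPy choose the row and column permutations. This is just one of many seriation methods, and not necessarily the one behind the figure here:

```python
import pandas as pd
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, leaves_list

scores = pd.read_csv("report_card_2024.csv")  # hypothetical file

# Fill skipped answers with the grand mean, for clustering only;
# the plot below still shows them as missing.
filled = scores.fillna(scores.stack().mean())

# Permute respondents (rows) and questions (columns) by the leaf
# order of a hierarchical clustering of each.
row_order = leaves_list(linkage(filled.values, method="average"))
col_order = leaves_list(linkage(filled.values.T, method="average"))
seriated = scores.iloc[row_order, col_order]

# Bertin-style display: "good" scores (4s and 5s) filled dark,
# "bad" ones (3 and below) light, skipped questions left blank.
shade = (seriated >= 4).astype(float) * 0.8 + 0.2
shade = shade.where(seriated.notna(), 0.0)

plt.matshow(shade.values, cmap="Greys")
plt.xticks(range(len(seriated.columns)), seriated.columns, rotation=90, fontsize=7)
plt.yticks([])
plt.show()
```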

The nice thing about this representation is that by filling in only the “good” scores (4s and 5s), but still showing the “bad” ones (3s and lower), we get a very good sense of how both the questions and groups of respondents hang together. This includes patterns of non-response, too. Things don’t cluster perfectly, of course. This is a heuristic to aid interpretation rather than a law of nature. But it’s still a very useful way to get an immediate sense of the entire dataset at a glance.

[Kieran Healy is a Professor of Sociology at Duke University. He also works on techniques and methods for data visualization.]
