Reddit vs. Digg - Don't Call It an Infographic

An infographic put together by RateRush has been making the rounds today, getting Liked on Facebook and Retweeted on Twitter. The data points they have collected and released are useful, but I bristle when they call it an infographic. It really just applies a lot of artistic chartjunk to a nominal amount of data. The graphics do very little, if anything, to help interpret the data. Let’s examine:

Reddit vs. Digg infographic

In the first graphic, we are shown a comparison of Reddit vs. Digg links submitted, but the two bar charts are drawn at different scales. So while the #1 submitters on each site both have bars at full height, Reddit’s #1 submitter has had only about 1/3 as many popular links as Digg’s #1 (8 to 23). It also says nothing about what percentage of that user’s links became popular. Did the Reddit user have to submit 30 links to get 8 popular ones? Or is it a function of their place in the community, where they have enough other users following them that most of their links become popular?
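To make the scale problem concrete, here is a minimal sketch of how the two bars would look if both were drawn against a single shared maximum. The 8 and 23 are the counts quoted above; the function name and the 100-unit bar height are my own assumptions for illustration.

```python
# Popular-link counts for each site's #1 submitter, as quoted in the post.
top_submitter = {"Digg": 23, "Reddit": 8}


def shared_scale_heights(counts, max_height=100.0):
    """Scale every bar against the single largest value across all series.

    max_height is an arbitrary drawing height (an assumption, not from the
    original graphic).
    """
    largest = max(counts.values())
    return {name: max_height * value / largest for name, value in counts.items()}


heights = shared_scale_heights(top_submitter)
for site, height in heights.items():
    print(f"{site}: {height:.1f}")
```

On a shared scale the Reddit bar comes out at roughly a third of the Digg bar, which is the comparison the original chart hides.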

Top 10 Domains Hitting the Homepage

This is a basic pie chart with the saturation boosted. The problem is that 100% of the pie is dedicated to only the Top 10 domains, and the chart does not show any percentage for Other.

For example, the Reddit chart makes it look like more than half of all front-page links are to imgur.com, but what the chart really says is imgur.com makes up half of all links to hit the front page that are in that list of 10 domains. There’s a difference.

What if these top 10 represent only 20% of front page links? That’s a potentially significant shift between homogeneity and heterogeneity that isn’t represented. There should be another pie slice for Other Domains.
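As a rough sketch of the arithmetic, here is what happens to the slices once an Other slice is added. The 20% figure is the what-if from the paragraph above, and the 50% share for imgur.com is a round-number assumption based on how the Reddit chart looks; neither is a measured value, and the function name is hypothetical.

```python
def with_other_slice(top10_shares, top10_share_of_all):
    """Rescale slices from 'share of the top-10 list' to 'share of all links'
    and append an explicit 'Other' slice for everything outside the list."""
    slices = {
        domain: share * top10_share_of_all
        for domain, share in top10_shares.items()
    }
    slices["Other"] = 1.0 - top10_share_of_all
    return slices


# Assumed inputs: imgur.com takes half of the top-10 pie, and the top 10
# together account for only 20% of all front-page links (the what-if above).
top10_shares = {"imgur.com": 0.5, "nine other domains": 0.5}
slices = with_other_slice(top10_shares, top10_share_of_all=0.20)

for name, share in slices.items():
    print(f"{name}: {share:.0%}")
```

Under those assumptions imgur.com drops from half the pie to a tenth of it, and Other becomes the dominant slice.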

Data points here are set on a clock. But wait, there are only 12 hours on an analog clock, so they have to put two separate data points, one for AM and one for PM, next to each other. Does that mean they’re trying to compare the differences between these 12 hours? I have no clue. A very simple line chart plotting these numbers on a 24-hour x-axis would tell much more about hour-to-hour fluctuations and provide a better visual comparison between the two sites. Here’s something I threw together using Google Charts:

Number of New Front Page Links By Hour

Interestingly, it looks like both services experience a spike in new front page content at 9am, something that was not revealed by the clock.
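For anyone who wants to redo this, the reshaping step is small: join the AM and PM values into one 24-hour series before plotting. The sketch below assumes the clock data is available as two 12-item lists starting at the 12 o'clock hour. The counts are made-up placeholders, not RateRush's numbers, and the function names are hypothetical.

```python
def to_24_hour_series(am_counts, pm_counts):
    """Join 12 AM-hour counts and 12 PM-hour counts into one 24-hour list.

    Index 0 of am_counts is the 12 AM hour and index 11 is 11 AM;
    pm_counts follows the same layout for 12 PM through 11 PM.
    """
    if len(am_counts) != 12 or len(pm_counts) != 12:
        raise ValueError("expected 12 values for AM and 12 for PM")
    return list(am_counts) + list(pm_counts)


def peak_hour(series):
    """Return the hour (0-23) with the most new front-page links."""
    return max(range(len(series)), key=lambda hour: series[hour])


# Placeholder counts for illustration only.
am = [3, 2, 2, 1, 1, 2, 4, 6, 8, 12, 9, 8]
pm = [7, 8, 7, 6, 6, 5, 5, 4, 4, 3, 3, 3]

series = to_24_hour_series(am, pm)
print(f"Peak hour: {peak_hour(series)}:00")
```

Once the data is one continuous series, any charting tool can draw it as a line, and a spike at a particular hour is hard to miss.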

This is a table of numbers that does nothing to highlight, for example, the high and low days for each site, or where there is typically a huge drop-off. Digg’s numbers between Thursday and Friday drop by over 20% while Reddit’s drop by only 8%.

Number of New Front Page Links by Weekday
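The Thursday-to-Friday drop is just a day-over-day percent change, which can be sketched as follows. The counts here are placeholders I picked so the results land near the drops mentioned above; they are not the values from the original table, and the function name is my own.

```python
def percent_change(previous, current):
    """Percent change from one day's count to the next (negative means a drop)."""
    return (current - previous) / previous * 100.0


# Placeholder Thursday and Friday counts, for illustration only.
thursday_friday = {
    "Digg": (100, 78),
    "Reddit": (50, 46),
}

changes = {
    site: percent_change(thursday, friday)
    for site, (thursday, friday) in thursday_friday.items()
}

for site, change in changes.items():
    print(f"{site}: {change:.1f}%")
```

Running the same calculation across every pair of adjacent weekdays would surface the biggest drop-offs without the reader having to do the subtraction by eye.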

Top 10 Most Common Words Appearing in Titles

What does breaking each list into 2 columns and squeezing them into a chalkboard graphic add to the visualization of this data? My takeaways from this are that Redditors are obsessed with themselves (#1 overall term), and Diggers are fond of sex and “infographics” (#3 & #5 terms, even if by infographic they mean “pretty pictures with a few data points”).

But really, this simply highlights the fact that Reddit supports self threads where the community has a discussion hosted on reddit.com itself, and so Reddit will often appear in the title. I don’t think that’s the way Digg works.

This infographic consists of single numbers placed on the sites’ respective logos. That’s it. But the definition of each data point includes both sites (a link first appears on X site, then later appears on Y), so having only one logo doesn’t convey the idea behind the data. You could be clearer with something like this, in plain text:

  • Digg → Reddit: 6
  • Reddit → Digg: 22

Even better would be to dive into which links made it onto the other site to see if we can draw some conclusions from those specific instances. At this point the data serve only as fodder for Reddit pride. (“Digg is copying our links!”)

Summary

This graphic may be nice to print out and tack on the wall of your cubicle: it’s eye-catching and contains some useful data points. But as an infographic, I don’t think it does anything to aid in the interpretation or understanding of the data it presents.

Also, if you didn’t take the time to read the fine print at the top of the graphic (and you likely didn’t, with all the eye candy shouting at you below), you wouldn’t have noticed the two different methodologies for determining popular links. On Digg, they gathered links that hit the front page. On Reddit (I believe due to personalization factors and subreddits), they took links that reached 100 upvotes. This may be the reason why, in charts like Number of New Front Page Links by Weekday, Digg appears to get more than twice as many links on the front page as Reddit.