Thursday, March 17, 2011

Google Search Results: Revised

First, a Disclaimer:

As always, not a professional academic or researcher. Also, not a statistician. I'm just a fine art major who happens to really, really like spreadsheets.


Revisions explained

Okay, so I got some helpful feedback yesterday that included improvements to my search methods. So I thought I'd write a quick post updating my data as well as providing some information about traffic and such. I may or may not get to the post about the context of these terms since I went a little crazy making spreadsheets. (Call me a nerd, but I love spreadsheets. They can fix anything.)

I made a major blunder in my original post, I realized. In my searches yesterday, I was only searching www.ign.com and not all of its various gaming subdomains. So I revised my search to ign.com, and this substantially affected the results. Now, ign.com also has subdomains for TV, movies, and comics that get much less traffic than its gaming subdomains, so I searched those subdomains separately and subtracted their results from the overall total. With the numbers for all of the gaming subdomains and forums included, the final outcome changed.
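For anyone who wants to redo the subtraction, here is a minimal Python sketch of it. The subdomain hostnames and all of the numbers are placeholders assumed for illustration, not the real search results, and `gaming_only_count` is just a made-up helper name.

```python
def gaming_only_count(domain_total: int, excluded_counts: dict[str, int]) -> int:
    """Result count for the whole domain minus the non-gaming subdomains."""
    remaining = domain_total - sum(excluded_counts.values())
    # Google's result counts are estimates, so guard against a negative remainder
    return max(remaining, 0)


# Placeholder numbers for illustration only (hostnames are assumed, too)
excluded = {"tv.ign.com": 40, "movies.ign.com": 35, "comics.ign.com": 25}
print(gaming_only_count(1000, excluded))  # 900
```

The same subtraction would be repeated once per search term, since each term has its own totals.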

I also had some requests to add some additional terms - specifically homophobic or male-gendered slurs. I did wind up adding "fag" to my list, but "dyke" did not make the cut because "dyke" is also a name. Specifically, Gordon Van Dyke, who is one of the big figures behind the Battlefield series, skewed the results too heavily. I also contemplated adding "prick" to the list of terms, but I made an entirely subjective judgement that "prick" is not "as bad" as "cunt". Entirely my opinion, but I'm also trying to keep the list of terms short so I don't go completely insane running them all through Google. For the same reason, I did not search for racial slurs, since that would cause the list to balloon beyond the point where I can gather the data in an hour or two.

Lastly, my problem with Kotaku was that instead of running "site:kotaku.com" through Google, I was using "site:http://www.kotaku.com". That's what I get for not copying and pasting, I suppose.
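A small Python sketch of how the queries could be built so that this mistake can't happen: it strips the scheme and a leading "www." before adding the site: operator. The helper name `site_query` and the choice to wrap the term in quotes are assumptions for illustration, not a record of the exact queries used.

```python
from urllib.parse import urlparse


def site_query(address: str, term: str) -> str:
    """Build a query restricted to a bare domain, e.g. site:kotaku.com "term"."""
    # urlparse only fills in netloc when a scheme or a leading // is present
    parsed = urlparse(address if "//" in address else "//" + address)
    host = parsed.netloc.lower()
    # Drop a leading "www." so the site's other subdomains are covered too
    if host.startswith("www."):
        host = host[4:]
    return f'site:{host} "{term}"'


print(site_query("http://www.kotaku.com", "example term"))  # site:kotaku.com "example term"
print(site_query("kotaku.com", "example term"))             # site:kotaku.com "example term"
```

Passing ign.com through the same helper gives site:ign.com, which is the broader search that picked up all of IGN's subdomains.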

The Results

So here are the raw results. (No pretty charts today. I like spreadsheets, but charts are a pain in the ass.)



Interestingly, adding Kotaku to the list didn't have any effect on the final outcome. Adding all of IGN's subdomains, however, did. In terms of raw results, IGN now comes out on top with 18 points, just barely edging out Destructoid at 17. Team Liquid's showing isn't quite as impressive, but is still pretty solid at 11.

It's worth noting that Joystiq only scored 3 points, and that Kotaku actually managed to score 0. Something I found almost as interesting is the fact that there are absolutely no results for "feminist/feminazi bitch" on Kotaku.

Now none of this gives us more than a very sketchy general picture without at least having some information about traffic patterns and context. Context we'll save for my next post. As for traffic patterns, I was able to find some super-basic traffic information for Destructoid, Kotaku, IGN, Joystiq, and Team Liquid by using Compete's free traffic search features. (It doesn't let you search subdomains.) The monthly normalized data for February for the five sites is as follows:



Sadly, data about page views is not available for free, so I can't provide that data. But unique visitors and monthly visits will still give us a pretty good picture.

In an attempt to at least half-assedly normalize the raw results, I decided to divide the unique visitors by the number of search results for each term. It doesn't really mean much in terms of where the words are coming from - staff writers? Users? Anonymous commenters? But it at least provides some sort of context as to traffic versus usage of each term. It seems counter-intuitive, but lower numbers are "bad" and higher numbers are "good":
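As a rough Python sketch of that division (the numbers are placeholders, not the real Compete or Google figures, and `visitors_per_result` is a hypothetical helper name):

```python
from typing import Optional


def visitors_per_result(unique_visitors: int, result_count: int) -> Optional[float]:
    """Unique visitors per search result; higher means the term is rarer relative to traffic."""
    if result_count == 0:
        # No results at all for the term: there is nothing to divide by
        return None
    return unique_visitors / result_count


# Placeholder numbers: the same 500 results on a big site and on a small one
print(visitors_per_result(1_000_000, 500))  # 2000.0
print(visitors_per_result(100_000, 500))    # 200.0
```

The small site ends up with the lower number, which is why lower is "bad": the same number of hits is spread over far fewer visitors. Returning None for zero results is one possible way to handle a site with no hits for a term; treating it as the best possible score would be another.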



I decided to go through these results and award points again, this time going from lowest to highest. When looking at unique visitors, this time Team Liquid came in first with 20 points, barely edging out Destructoid with 19 points. IGN, by comparison, came in a distant third with a meager 6 points.
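The exact point scheme isn't spelled out in this post, so the following Python sketch is only one plausible version: for a single term, rank the sites from lowest to highest and hand out 3, 2, and 1 points to the first three. The `points=(3, 2, 1)` values, the site names, and the numbers are all assumptions for illustration.

```python
from typing import Optional


def award_points(
    scores: dict[str, Optional[float]],
    points: tuple[int, ...] = (3, 2, 1),  # assumed scheme, not taken from the post
    lowest_first: bool = True,
) -> dict[str, int]:
    """Rank sites for one term and give points to the top few; everyone else gets 0."""
    ranked = sorted(
        (site for site, value in scores.items() if value is not None),
        key=lambda site: scores[site],
        reverse=not lowest_first,
    )
    awarded = {site: 0 for site in scores}
    # zip stops at the shorter sequence, so only the first len(points) sites score
    for site, pts in zip(ranked, points):
        awarded[site] = pts
    return awarded


# Placeholder visitors-per-result values for one term; None means no results
term_scores = {"Site A": 200.0, "Site B": 2000.0, "Site C": 950.0, "Site D": None}
print(award_points(term_scores))
# {'Site A': 3, 'Site B': 1, 'Site C': 2, 'Site D': 0}
```

Summing the per-term dictionaries would give each site's total. For the raw counts, where more results is worse, the same function would be called with `lowest_first=False`. Ties are broken by dictionary order here, which is an arbitrary choice.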

When you divide monthly visits by the number of search results, the results change again - but the overall picture stays the same:



By this metric, Destructoid wins with 20 points, Team Liquid places second with 16 points, and IGN once again comes in third with 6 points.

What does any of this mean?

Well, not a whole lot really. We can make sort of general statements saying that Destructoid and Team Liquid seem to have a higher per capita usage of these terms than other sites, but it's not possible to make any definitive statements about just what any of this means. Another important factor that was not possible for me to examine is the source of the comments. With the exception of Team Liquid, all of these sites employ paid writers, but they also host user blogs. As mentioned before, it's not really possible for me to discern the frequency of use by the writers versus the frequency of use by users or anonymous commenters.

So, overall these numbers aren't that useful from an academic standpoint. However, they provide a useful illustration of the fact that misogynist language (as well as other forms of hate speech) is pervasive across all major gaming sites, and that some sites are consistently more guilty of using this language than others.