If you’ve spent any time at all on Twitter, you know that it can be a great place for a variety of things – real-time news, celeb-watching, comedy, and the list goes on and on. But you also know that Twitter is full of the kind of homophobic and racist language that can make you physically recoil. Now, a group of researchers has developed an interactive map of the hate speech that Americans are pumping out on a daily basis.
The map was created by geography students at California’s Humboldt State University, the same group of people who brought us the post-election Twitter racism map back in November. Back then, they looked at racist tweets that focused on President Obama’s reelection and found that Mississippi and Alabama were the two hotbeds for such activity.
“Rather than focusing just on hate directed towards a single individual at a single point in time, we wanted to analyze a broader swath of discriminatory speech in social media, including the usage of racist, homophobic and ableist slurs,” say the researchers.
For instance, here’s the map of generally “homophobic” tweets, which are determined by the use of words like “dyke,” “fag,” “homo,” and “queer.”
And here’s the map of racist tweets – those containing the words “nigger,” “chink,” “wetback,” “gook,” or “spick”:
Of course, analysis like this is never going to be 100% accurate. Keyword analysis has inherent issues. For instance, the word “queer” is not always used in a derogatory, hate-filled manner. People could be tweeting out the word “fag” in another context, such as bemoaning its usage.
On the other hand, it’s hard to justify many uses of words like “wetback” on Twitter. Sure, it’s not completely solid analysis, but it’s pretty close. You have to imagine that the majority of people tweeting about fags, dykes, niggers, and chinks are doing so in a hateful manner.
But to cut out this sort of uncertainty, the researchers manually read and coded each tweet to judge its sentiment, “in order to address one of the earlier criticisms of our map of racism directed at Obama.” This way, they could tell whether a tweet that contained the word “queer” was actually posted in a hateful context.
The researchers describe their method this way:
Using DOLLY to search for all geotagged tweets in North America between June 2012 and April 2013, we discovered 41,306 tweets containing the word ‘nigger’, 95,123 referenced ‘homo’, among other terms. In order to address one of the earlier criticisms of our map of racism directed at Obama, students at Humboldt State manually read and coded the sentiment of each tweet to determine if the given word was used in a positive, negative or neutral manner. This allowed us to avoid using any algorithmic sentiment analysis or natural language processing, as many algorithms would have simply classified a tweet as ‘negative’ when the word was used in a neutral or positive way. For example the phrase ‘dyke’, while often negative when referring to an individual person, was also used in positive ways (e.g. “dykes on bikes #SFPride”). The students were able to discern which were negative, neutral, or positive. Only those tweets used in an explicitly negative way are included in the map.
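For readers who want a concrete picture of that two-step process, here is a minimal Python sketch: a keyword match over geotagged tweets, followed by a filter on a human-assigned label. It is not the researchers' code and has nothing to do with DOLLY's actual interface; the `Tweet` class, the function names, the sample data, and the placeholder keywords are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set

# Placeholder terms (assumption): stand-ins for the study's keyword list.
KEYWORDS: Set[str] = {"slur_a", "slur_b"}


@dataclass
class Tweet:
    """Hypothetical record for one geotagged tweet."""
    text: str
    lat: float
    lon: float
    # Set by a human reader: "negative", "neutral", or "positive".
    label: Optional[str] = None


def matched_keywords(text: str) -> Set[str]:
    """Return the keywords that appear as whole words in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return tokens & KEYWORDS


def candidates(tweets: List[Tweet]) -> List[Tweet]:
    """Step 1: keep every tweet that contains at least one keyword."""
    return [t for t in tweets if matched_keywords(t.text)]


def mappable(tweets: List[Tweet]) -> List[Tweet]:
    """Step 2: of the keyword matches, keep only those a human coded negative."""
    return [t for t in candidates(tweets) if t.label == "negative"]


sample = [
    Tweet("slur_a on bikes #SFPride", 37.77, -122.42, label="positive"),
    Tweet("you are such a slur_a", 32.30, -90.18, label="negative"),
    Tweet("nothing to see here", 40.00, -100.00),
]

print(len(candidates(sample)))  # 2 tweets contain a keyword
print(len(mappable(sample)))    # 1 tweet was coded negative
```

The keyword step alone flags both of the first two sample tweets. Only the human label separates them, which is the reason the researchers give for avoiding algorithmic sentiment analysis.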
You can check out the full interactive map here, where you can zoom in to see specific concentrations of Twitter hate speech.
[Floating Sheep via MIT Technology Review]