Context reduces racial bias in hate speech detection algorithms


When it comes to accurately flagging hate speech on social media, context matters, says a new study aimed at reducing errors that could amplify racial bias.

Hate speech detection algorithms, designed to stop the spread of hateful speech on social media, can actually amplify racial bias by blocking inoffensive tweets by black people or other minority group members.

In fact, one previous study showed that AI models were 1.5 times more likely to flag tweets written by African Americans as "offensive" — in other words, a false positive — compared to other tweets.

Why? Because current automatic detection models miss out on something vital: context. Specifically, hate speech classifiers are oversensitive to group identifiers like "black," "gay," or "transgender," which are only indicators of hate speech when used in some settings.

Now, a team of USC researchers has created a hate speech classifier that is more context-sensitive, and less likely to mistake a post containing a group identifier for hate speech.

To achieve this, the researchers programmed the algorithm to consider two additional factors: the context in which the group identifier is used, and whether specific features of hate speech are also present, such as dehumanizing and insulting language.
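The idea behind those two factors can be illustrated with a minimal sketch. This is not the USC team's model — the word lists and function names below are invented placeholders — but it shows why requiring hateful features alongside a group identifier reduces false positives compared with flagging on identifiers alone.

```python
# Toy illustration (NOT the USC classifier): compare an over-sensitive
# baseline that flags any group identifier against a context-aware check
# that also requires features of hate speech to be present.
# All word lists here are illustrative placeholders, not real lexicons.

GROUP_IDENTIFIERS = {"black", "gay", "transgender"}
# Placeholder examples of dehumanizing/insulting language features:
HATEFUL_FEATURES = {"vermin", "subhuman", "scum"}

def naive_flag(post: str) -> bool:
    """Over-sensitive baseline: flags any mention of a group identifier."""
    words = set(post.lower().split())
    return bool(words & GROUP_IDENTIFIERS)

def context_aware_flag(post: str) -> bool:
    """Flags only when an identifier co-occurs with hateful features."""
    words = set(post.lower().split())
    return bool(words & GROUP_IDENTIFIERS) and bool(words & HATEFUL_FEATURES)

benign = "proud to be a black artist"
print(naive_flag(benign))          # True  -- a false positive
print(context_aware_flag(benign))  # False -- correctly left alone
```

Real systems learn these signals statistically rather than from fixed word lists, but the structure is the same: the group identifier on its own is not treated as sufficient evidence of hate speech.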


Story Source:
Materials provided by University of Southern California. Original written by Caitlin Dawson. Note: Content may be edited for style and length.

