Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making

Burnap, Peter and Williams, Matthew Leighton 2015. Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet 7 (2), pp. 223-242. doi:10.1002/poi3.85

PDF - Published Version
Available under License Creative Commons Attribution.


Abstract

The use of “Big Data” in policy and decision making is a current topic of debate. The 2013 murder of Drummer Lee Rigby in Woolwich, London, UK, led to an extensive public reaction on social media, providing the opportunity to study the spread of online hate speech (cyber hate) on Twitter. Human-annotated Twitter data was collected in the immediate aftermath of Rigby's murder to train and test a supervised machine learning text classifier that distinguishes hateful and/or antagonistic responses focused on race, ethnicity, or religion from more general responses. Classification features were derived from the content of each tweet, including grammatical dependencies between words to recognize “othering” phrases, incitement to respond with antagonistic action, and claims of well-founded or justified discrimination against social groups. Classification results were best when probabilistic, rule-based, and spatial-based classifiers were combined with a voted ensemble meta-classifier. We demonstrate how the classifier's results can be used robustly in a statistical model that forecasts the likely spread of cyber hate in a sample of Twitter data. The applications to policy and decision making are discussed.
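
The voted ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the voting rule, so simple majority voting is assumed here, and the three base classifiers, their weights, and the placeholder tokens (`marker_a`, `marker_b`, `neutral`) are hypothetical stand-ins for a rule-based, a probabilistic, and a spatial (margin-based) learner.

```python
from collections import Counter
from typing import Callable, Sequence

# A base classifier maps a text to 1 (flagged) or 0 (not flagged).
Classifier = Callable[[str], int]


def rule_based(text: str) -> int:
    """Hypothetical rule: flag the text if a fixed marker token is present."""
    return int("marker_a" in text.split())


def probabilistic(text: str) -> int:
    """Hypothetical score-based stand-in: sum per-token weights, threshold at 0.5."""
    weights = {"marker_a": 0.6, "marker_b": 0.3}  # assumed values, not learned
    score = sum(weights.get(tok, 0.0) for tok in text.split())
    return int(score >= 0.5)


def spatial(text: str) -> int:
    """Hypothetical linear decision function: sign of (w . x + bias)."""
    w = {"marker_a": 1.0, "marker_b": 1.0, "neutral": -1.0}  # assumed values
    bias = -1.5
    return int(sum(w.get(tok, 0.0) for tok in text.split()) + bias > 0)


def majority_vote(classifiers: Sequence[Classifier], text: str) -> int:
    """Return the label chosen by most base classifiers (ties resolve to 0)."""
    votes = Counter(clf(text) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


ENSEMBLE = (rule_based, probabilistic, spatial)

if __name__ == "__main__":
    for sample in ("marker_a marker_b", "marker_a neutral", "marker_b neutral"):
        print(sample, "->", majority_vote(ENSEMBLE, sample))
```

In this toy setup the second sample is flagged even though the spatial stand-in disagrees, which is the point of voting: a single base classifier's error can be outvoted by the others.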

Item Type: Article
Date Type: Publication
Status: Published
Schools: Cardiff Centre for Crime, Law and Justice (CCLJ); Computer Science & Informatics; Social Sciences (Includes Criminology and Education)
Subjects: H Social Sciences > HM Sociology; Q Science > QA Mathematics > QA76 Computer software
Uncontrolled Keywords: Twitter; hate speech; Internet; policy; machine classification; statistical modeling; cyber hate; ensemble classifier
Publisher: Wiley
ISSN: 1944-2866
Funders: ESRC, Google Data Analytics
Last Modified: 23 Oct 2015 11:32
URI: http://orca-mwe.cf.ac.uk/id/eprint/73158

Citation Data

Cited 11 times in Scopus.
