Joint investigation reveals scale of abuse faced by MPs

A joint investigation by regional newsrooms across the country has found more than 3,000 offensive tweets are sent to MPs every day.

Journalists from the BBC, ITV, Reach plc, Newsquest, National World, Iliffe Media and Birmingham City University analysed three million tweets directed at MPs over the course of six weeks between March and April this year.

The investigation, led by the BBC’s Shared Data Unit, found one in 20 tweets were deemed to be “toxic”, and female MPs were marginally more likely to receive abuse than their male counterparts.
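As a rough consistency check, the headline daily figure can be reproduced from the numbers quoted above. The sketch below is not the investigation's own calculation: it assumes the rounded figures (three million tweets, one in 20) are close enough to the real ones, that "six weeks" means exactly 42 days, and that "offensive" and "toxic" refer to the same tweets.

```python
# Rounded figures taken from the article; the result is only approximate.
total_tweets = 3_000_000   # "three million tweets"
one_in = 20                # "one in 20" deemed toxic
days = 6 * 7               # "six weeks", assumed here to be exactly 42 days

toxic_tweets = total_tweets / one_in
toxic_per_day = toxic_tweets / days

print(f"{toxic_tweets:,.0f} toxic tweets, about {toxic_per_day:,.0f} per day")
```

On those assumptions the estimate comes out at roughly 3,500 a day, in line with the "more than 3,000" reported.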

The posts were analysed by a machine-learning tool built to identify harmful conversations online.

It is one of the first times artificial intelligence (AI) has been used for a single journalism project by dozens of UK regional newsrooms.

Pete Sherlock, pictured, assistant editor at the BBC Shared Data Unit, said: “This investigation shows how much can be achieved through collaborative journalism.

“By bringing a wide range of journalists from across the UK together alongside data scientists and academics, we were able to produce content we would not have been able to do as individuals.”

On 28 April this year, the Shared Data Unit held a ‘Hack Day’, inviting 25 journalists and data analysts from the participating organisations to test the dataset.

Delegates from ITV, Reach, Newsquest, National World, Birmingham City University and Iliffe Media worked in groups to explore the dataset, find out what people were saying in tweets mentioning MPs and discuss how they could define the toxic proportion of them.

They were also joined on the day by a series of guests including Mike Wendling, author and former editor of the BBC’s disinformation unit, and Jess Phillips MP.

Paul Bradshaw, professor of journalism at Birmingham City University and BBC Shared Data Unit, said: “The shared data unit team created a number of ‘recipes’ so that journalists from local newspapers and broadcasters could use a range of AI technologies to find stories in almost 3 million tweets mentioning MPs.

“These included machine learning – which makes it possible to ‘learn’ how likely, for example, a sentence is toxic – and natural language processing – which makes it possible to extract the adjectives, names or nouns from a passage of text.

“The journalists also used a form of machine learning called unsupervised learning to group tweets into common themes, making it easier for reporters to find the ‘needle in a haystack’.”
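To illustrate what grouping tweets into themes without labels can look like, here is a minimal sketch using k-means-style clustering on word counts. It is not the Shared Data Unit's "recipe", which the article does not describe in detail: the algorithm choice, the stopword list, the function names and the four sample tweets are all assumptions made up for this example.

```python
from collections import Counter
import math

# Hypothetical stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "to", "on", "is", "of", "and", "in", "are", "at", "our"}

def vectorise(text):
    """Turn a tweet into a bag-of-words count vector."""
    words = [w.strip(".,!?@#").lower() for w in text.split()]
    return Counter(w for w in words if w and w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def cluster(tweets, k=2, iterations=10):
    """Assign each tweet to one of k themes; no labels are supplied."""
    vectors = [vectorise(t) for t in tweets]
    # Seed with the first tweet, then repeatedly add the tweet least similar
    # to the seeds chosen so far.
    centres = [vectors[0]]
    while len(centres) < k:
        centres.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centres))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every tweet to its most similar centre.
        labels = [
            max(range(k), key=lambda i: cosine(v, centres[i])) for v in vectors
        ]
        # Recompute each centre as the summed word counts of its members
        # (for cosine similarity, a sum points the same way as a mean).
        for i in range(k):
            members = [v for v, label in zip(vectors, labels) if label == i]
            if members:
                centres[i] = sum(members, Counter())
    return labels

# Invented example tweets, not taken from the investigation's dataset.
tweets = [
    "Please fix the potholes on our high street",
    "The potholes on the high street are getting worse",
    "Hospital waiting times are far too long",
    "Waiting times at the hospital keep growing",
]
labels = cluster(tweets, k=2)
for label, tweet in zip(labels, tweets):
    print(label, tweet)
```

In this toy case the two pothole tweets land in one group and the two hospital tweets in the other, which is the kind of shortcut that lets a reporter scan themes rather than millions of individual posts.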

Paul Lynch, journalist with the Shared Data Unit who led on the project, added: “There were quite a few Eureka moments – but something that got me was the realisation certain groups of words occurred more frequently when referring to certain MPs.

“None of this project would have been possible if not for that collaborative day at the start of it all – and we are continuing to work with the delegates who are making their own local or regional versions of the story, analysing the abuse against the MPs in their area.”