
17 Nov 2015

Politics and Society

Working towards sentiment analysis

In our work investigating how people discuss the EU on Twitter, one of our aims is to determine sentiment: both pro and anti the EU in general and, in relation to the referendum on UK membership of the EU, pro-remain or pro-leave.

The approach we have taken initially is very straightforward. If tweeters use hashtags associated with the leave camp (including #brexit, #no2eu, #notoeu, #betteroffout, #voteout, #eureform, #britainout, #leaveeu, #voteleave, #beleave, #loveeuropeleaveeu) then we judge their sentiment to be pro-leave. If they use hashtags associated with the remain camp (#yes2eu, #yestoeu, #betteroffin, #votein, #ukineu, #bremain, #strongerin, #leadnotleave, #voteremain) then we judge their sentiment to be pro-remain.
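As a rough illustration, the hashtag rule described above could be sketched in Python as follows. The two hashtag lists are copied from this post; the function names, the regular expression and the "mixed" and "unclassified" labels are assumptions made for the example, not a description of our actual pipeline.

```python
import re

# Hashtag lists as given in the post (lower-cased, without the leading '#').
LEAVE_TAGS = {
    "brexit", "no2eu", "notoeu", "betteroffout", "voteout", "eureform",
    "britainout", "leaveeu", "voteleave", "beleave", "loveeuropeleaveeu",
}
REMAIN_TAGS = {
    "yes2eu", "yestoeu", "betteroffin", "votein", "ukineu", "bremain",
    "strongerin", "leadnotleave", "voteremain",
}

# Assumption: a hashtag is '#' followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the set of lower-cased hashtags (without '#') found in a tweet."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def classify(text):
    """Label a tweet by the campaign hashtags it contains."""
    tags = extract_hashtags(text)
    has_leave = bool(tags & LEAVE_TAGS)
    has_remain = bool(tags & REMAIN_TAGS)
    if has_leave and has_remain:
        # Assumption: the post does not say how tweets using both camps'
        # hashtags are scored, so this sketch flags them separately.
        return "mixed"
    if has_leave:
        return "pro-leave"
    if has_remain:
        return "pro-remain"
    return "unclassified"


if __name__ == "__main__":
    print(classify("Leaving the EU 'won't undermine the UK' #euref #voteleave"))  # pro-leave
    print(classify("Think again #StrongerIn #Brexit #EUref"))  # mixed
```

Matching is case-insensitive, so #Brexit and #brexit are treated as the same hashtag. A neutral tag such as #euref appears in neither list and therefore does not affect the label.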

If we do this we get the following results:

Sentiment scores 9th Aug - 27th Oct

And we can see that these results are fairly consistent day by day:

I'm working on the day by day sentiment in our EU dataset
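One way to check day-by-day consistency is to compute, for each day, the share of classified tweets that are pro-leave. The sketch below assumes each tweet has already been given a date and one of the labels "pro-leave" or "pro-remain"; the function name and input format are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def daily_leave_share(labelled):
    """Return {day: share of pro-leave among pro-leave and pro-remain tweets}.

    `labelled` is an iterable of (day, label) pairs. Tweets with any other
    label are ignored, and days with no classified tweets are omitted.
    """
    per_day = defaultdict(Counter)
    for day, label in labelled:
        if label in ("pro-leave", "pro-remain"):
            per_day[day][label] += 1
    return {
        day: counts["pro-leave"] / (counts["pro-leave"] + counts["pro-remain"])
        for day, counts in sorted(per_day.items())
    }
```

A share that stays in a narrow band from one day to the next would correspond to the consistency described here; the sketch says nothing about what the real values are.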

But they are not consistent with polling data, such as the ICM tracker below, which shows remain with the higher score. So why is that?

Lots of claims about EU referendum polling - here's ICM tracker which suggests "in" still have decent lead

First, we need to remember that research using Twitter data can only ever tell us what people who use Twitter think; it does not necessarily reflect the population at large. Interestingly, this has been a problem for pollsters trying to include this data in their election prediction polls.


Secondly, we also need to remember that, as we discussed in a previous post ("How (not) to predict elections"), people tend to tweet against things rather than for them.

Thirdly, we are currently restricting our analysis to sentiment associated with hashtags. This means that if a tweet doesn’t have a hashtag we are not analysing its sentiment.

If we take a look at the data we can see some differences between the two camps in the way that tweets, and in particular hashtags, are used. The leave campaigns tend to use many hashtags; see the examples below from both LEAVE.EU and Vote Leave, especially LEAVE.EU.

LEAVE.EU (LeaveEUOfficial) Twitter: Congratulations to Portugal’s new anti-austerity government – Brussels Beware  #EUref #Brexit #Austerity

Leaving the EU 'won't undermine the UK as a financial centre' - Axel Weber, Chairman of UBS #euref #voteleave

LEAVE.EU use hashtags in their Twitter bio, which may encourage followers to use them.

LEAVE.EU (LeaveEUOfficial) Twitter : The latest Tweets from LEAVE.EU (@LeaveEUOfficial). A cross party & non-political campaign, advocating the vote to leave the EU in the upcoming referendum. #LeaveEU #Brexit

The remain camp do not tend to use as many hashtags in their posts – for example:

The Leave Campaign are losing the arguments + threatening to get ‘nasty’. Let's stop them:

Tweets that would be considered pro-remain often also include hashtags we have classified as pro-leave. There could be several reasons for this, such as tweeters positioning themselves within the debate by using a popular hashtag, or trying to talk to those who hold opposite views.

Richard Corbett @RCorbettMEP Twitter: U think being part of EU holds back trade with rest of world? Think again: … #StrongerIn #Brexit #EUref @euromove

We can see this by looking at the hashtags that are used in conjunction with #strongerin and #brexit (graphs below). We find, overall, a much lower use of #strongerin, and we find that it is used with leave hashtags, especially #brexit, #leaveEU and #voteLeave. By contrast, #brexit is not used with remain hashtags at all, only with other leave hashtags.

Hashtags associated with #strongerin in our data: much lower use than #brexit, and often used with leave hashtags

first hashtag graph

terms associated with #brexit in our dataset

second hashtag graph
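The hashtag associations shown in the graphs above come down to counting which hashtags appear in the same tweet as a given hashtag. A minimal sketch of that count, assuming tweets are available as plain text strings; the function name and regular expression are illustrative and not taken from our own tooling:

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def cooccurring_hashtags(tweets, target):
    """Count the hashtags that appear in the same tweet as `target`.

    Each co-occurring hashtag is counted at most once per tweet, and
    matching is case-insensitive.
    """
    target = target.lower().lstrip("#")
    counts = Counter()
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if target in tags:
            counts.update(tags - {target})
    return counts
```

Calling `cooccurring_hashtags(tweets, "#strongerin").most_common(10)` would then list the ten hashtags most often used alongside #strongerin in whatever collection of tweets is passed in.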

In the future we aim to do a more sophisticated form of sentiment analysis where we analyse the text within the tweets. This leads on to a final problem that is often discussed in association with sentiment analysis of text, and one that we also see here: identifying the target of the sentiment. The target is the item that the sentiment is expressed towards.

In the tweet by Richard Corbett MEP above, the ‘leave’ sentiment is expressed within ‘U think being part of EU holds back trade with rest of world?’ and the ‘remain’ sentiment is expressed within ‘Think again’. Automatically identifying which part of the text each target is associated with is not an easy issue to tackle, and we haven’t even begun to discuss jokes and sarcasm yet!

So you might be tempted to ask: if this data is not representative of the general public, why are we looking at it? What we can look at is how this data changes. If we can identify the differences and track the relationship between Twitter data and more general public opinion, we can start to hypothesise about how changes in the Twitter data relate to changes in public opinion more widely. We’ll talk about this a lot more in the future.

Our project is part of the Economic and Social Research Council’s The UK in a Changing Europe programme. Look out for our regular updates as the project tracks developments in the debate on the UK’s continued membership of the EU and follow us @myimageoftheEU

