All the major party manifestos are now out, and we can see what their plans are if they win the election: scrapping free school meals, maybe, or renewing / not renewing Trident. But whatever else the parties say, or plan to do, we also know that probably the single most important issue in this election is Brexit.
Here, a tension has been evident since last summer: although overall the country voted narrowly in favour of leaving the EU, the majority of MPs were personally against this (including, most notably, Theresa May). The Conservative Party has been working hard to deal with this, and is reportedly vetting new candidates standing in constituencies across the country to make sure they’re confirmed Brexiters (and not, say, soft Leavers – or worse, outright Remainers).
But again, this poses a problem, as we don’t actually know how the national vote to Leave the EU broke down at constituency level – and so we don’t know where there might be mismatches between a Leave-voting constituency and a Remain-voting MP (or vice versa).
Six days after last year’s referendum, I sat down to try and work out how Westminster constituencies had voted. This turned out to be quite tricky, because the referendum result was counted by local authority area – not by constituency. The two only roughly overlap, and so I had to make some adjustments to get the calculations to work out.
I’ve recently had the final set of estimates (and the methodology for producing them) accepted for publication in an academic journal, and so I thought I would describe the steps I’ve taken to work out reasonable answers to what seems like a simple question.
Since the referendum I’ve published three sets of estimates:
- an initial set of estimates produced the week after the referendum;
- a second set of estimates produced in August, on the basis of which I initially submitted my article;
- a third set of estimates, which resulted from changes suggested during the review process.
For the first set of estimates I built a statistical model which explained the Leave share of the vote in each local authority area using demographic characteristics. I then used that model to extrapolate from the demographic characteristics of Westminster constituencies.
The problem with this first set of estimates was that some were demonstrably wrong. Where a constituency overlapped perfectly with a local authority, this method wasn’t guaranteed to reproduce the (known) local authority result.
I fixed this problem with the second set of estimates. Here I built a statistical model which “explained” the number of Leave and Remain voters in each local authority area. I used that model to extrapolate from the demographic characteristics of groups of Census Output Areas. I then divided or multiplied these extrapolations as appropriate to make sure that they added up to the local authority totals, before adding these scaled extrapolations up to Westminster constituencies.
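The scaling and adding-up steps can be sketched in a few lines of Python. This is an illustration only, not the code behind the published estimates: the function name, the data layout and all of the numbers are invented, and it assumes the model's raw extrapolations for each group of Output Areas are already in hand.

```python
from collections import defaultdict

def scale_and_aggregate(areas, la_totals):
    """Scale raw extrapolations to known local authority totals,
    then add them up to constituencies.

    areas: list of dicts with keys 'la', 'constituency',
           'raw_leave', 'raw_remain' (hypothetical layout).
    la_totals: dict mapping local authority -> (leave, remain) counts.
    Returns a dict mapping constituency -> (leave, remain).
    """
    # Sum the raw extrapolations within each local authority.
    raw_sums = defaultdict(lambda: [0.0, 0.0])
    for a in areas:
        raw_sums[a["la"]][0] += a["raw_leave"]
        raw_sums[a["la"]][1] += a["raw_remain"]

    # Rescale each area so the local authority totals are matched,
    # then accumulate the scaled figures by constituency.
    result = defaultdict(lambda: [0.0, 0.0])
    for a in areas:
        known_leave, known_remain = la_totals[a["la"]]
        raw_leave_sum, raw_remain_sum = raw_sums[a["la"]]
        result[a["constituency"]][0] += a["raw_leave"] * known_leave / raw_leave_sum
        result[a["constituency"]][1] += a["raw_remain"] * known_remain / raw_remain_sum
    return {c: tuple(v) for c, v in result.items()}

# Invented example: constituency X coincides with local authority A,
# while local authority B is split between constituencies Y and Z.
areas = [
    {"la": "A", "constituency": "X", "raw_leave": 900, "raw_remain": 1100},
    {"la": "A", "constituency": "X", "raw_leave": 1100, "raw_remain": 900},
    {"la": "B", "constituency": "Y", "raw_leave": 500, "raw_remain": 1500},
    {"la": "B", "constituency": "Z", "raw_leave": 1500, "raw_remain": 500},
]
la_totals = {"A": (2200, 1800), "B": (2400, 1600)}

estimates = scale_and_aggregate(areas, la_totals)
for name, (leave, remain) in sorted(estimates.items()):
    print(name, round(100 * leave / (leave + remain), 1))
```

Because X coincides exactly with local authority A, its scaled figures add up to A's known totals, which is the property the first set of estimates lacked.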
The problem with this second set of estimates was that I had not accounted for some relationships between demographic variables. For example, we know – roughly – that the referendum vote was highly correlated with education levels, as areas with greater proportions of people with university degrees tended to vote Remain. But high levels of graduate qualifications mean something different in older constituencies compared to younger constituencies: older people had fewer chances to go to university, because participation in tertiary education was much lower when they were growing up. If, despite this, an older constituency has lots of graduates, the effect of education on the vote may differ from what we would see in a younger constituency.
I fixed this problem with the third set of estimates, which included an interaction term between age and the proportion of the population with higher educational qualifications.
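In generic terms, an interaction term looks like the following. This is a simplified linear sketch with only two predictors; the symbols are illustrative, and the published model works with counts of voters and includes other demographic variables.

```latex
\text{Leave}_i = \beta_0
  + \beta_1\,\text{age}_i
  + \beta_2\,\text{degree}_i
  + \beta_3\,(\text{age}_i \times \text{degree}_i)
  + \varepsilon_i,
\qquad
\frac{\partial\,\text{Leave}_i}{\partial\,\text{degree}_i}
  = \beta_2 + \beta_3\,\text{age}_i
```

The second expression is the point of the exercise: with the interaction included, the association between graduates and the Leave vote is no longer a single number, but is allowed to vary with the age profile of the area.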
Fortunately, this last change (which emerged during the review process) did not affect the overall story. These estimates are all very similar. The graph below shows, in the lower left, the pairwise scatter-plots of the different sets of estimates, and in the upper right, the correlation between the sets of estimates. The correlation between the second and third sets of estimates is very high indeed.
I know from some ward-level results that these estimates are reasonably good – certainly better than just using the figure for the closest-matching local authority.
That said, these figures are not perfect. There are always errors. It’s because of errors like these that I have on occasion felt awkward about the way in which my estimates have been used to criticise named MPs for ignoring the will of their constituents – particularly when these MPs are called traitors or enemies of democracy.
If you want to use these estimates, I’d ask that you do two things.
First, naturally, I’d really appreciate it if you could cite my article:
“Areal interpolation and the UK’s referendum on EU membership”, in the Journal of Elections, Public Opinion and Parties.
Second, I’d like you to say “probably” before you talk about how a constituency voted, unless I’ve flagged up a result as being known exactly. I don’t have confidence intervals for my estimates: there’s no clear statistical theory underpinning the scaling step. You can look at the empirical distribution of errors from councils which have declared results. Or you can just say, “probably”.
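If you do want to look at the empirical distribution of errors, a minimal sketch in Python might look like this. The function name and every figure here are made up for illustration; you would substitute the estimates and whatever declared results you have.

```python
import statistics

def error_summary(estimated, declared):
    """Compare estimated Leave shares (%) with declared results.

    Both arguments are dicts mapping an area name to a Leave share.
    Only areas that appear in both are compared.
    """
    errors = sorted(
        abs(estimated[area] - declared[area])
        for area in declared
        if area in estimated
    )
    return {
        "n": len(errors),
        "mean_abs_error": sum(errors) / len(errors),
        "median_abs_error": statistics.median(errors),
        "max_abs_error": errors[-1],
    }

# Hypothetical figures: four estimates, three of which can be checked.
estimated = {"P": 52.0, "Q": 61.5, "R": 44.0, "S": 38.5}
declared = {"P": 54.0, "Q": 60.5, "R": 47.0}

summary = error_summary(estimated, declared)
print(summary)
```

A summary like this gives a rough sense of how far off an estimate might be, but it is not a confidence interval, so "probably" still applies.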
We don’t know what impact the referendum will have on voting behaviour in the election. Maybe it will have no impact, and the average Leave seat will see the same change in the vote for the main parties as the average Remain seat. Or maybe the referendum outcomes will provide the basis for new patterns of electoral competition.
If it’s the latter, then MPs elected to the next Parliament may realise that their constituents continue to have views on Brexit, and that these views, far from being confined to a one-off referendum, shape their voting behaviour. That could make the 2017 Parliament very interesting indeed.
By Dr Chris Hanretty, Reader in Politics at the University of East Anglia.