Just over half a year ago, the Telegraph carried out an analysis appearing to show that ‘the Labour leader’s shadow cabinet d[id]n’t have as wide a reach as their opposite numbers on Twitter’. This conclusion was arrived at by comparing ministers and shadow ministers whose roles were directly parallel: ‘[Jeremy] Corbyn has more followers than Theresa May, while Diane Abbott saw off Amber Rudd, John McDonnell beat Philip Hammond and Keir Starmer edged out David Davis’, but with regard to the others, ‘the Government enjoyed a clean sweep of the board’ (ibid.).

This is interesting, but I don’t find it satisfactory. The Conservative Party’s best known and most popular politicians were mostly in the cabinet. But while Corbyn himself remains the Labour Party’s biggest social media star, its second- and fourth-most popular MPs on Twitter were and are excluded from the shadow cabinet by virtue of not being Corbyn loyalists, while the third-most popular has technically remained a shadow cabinet member but was excluded from the Telegraph’s analysis by virtue of having no Tory opposite number.

So what happens if we look at the public followers of all prospective parliamentary candidates? This happens. (Figures collected in the week before the General Election for a different purpose and re-used here. Small parties excluded. Hat tip to Democracy Club for its crowdsourced list of politicians’ social media accounts.)

# Setup
suppressWarnings(suppressMessages(library(ggplot2)))
suppressWarnings(suppressMessages(library(plyr)))
options(scipen=999)
cd <- readRDS('candidate-data.RDS')

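# Create data frame without parties fielding fewer than 40 candidates
# (plyr::count on a vector returns a data frame with columns 'x' and 'freq')
parties <- count(cd$party.name)
parties.forty.plus <- cd[cd$party.name %in% parties$x[parties$freq >= 40],]
# Independents are not a party, so drop them even though there are 40+ of them
parties.forty.plus <- parties.forty.plus[parties.forty.plus$party.name != 'Independent',]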

# Visualise as boxplot (using a log transformation with base 10)
ggplot(parties.forty.plus, aes(party.name, log10(followers + 1))) +
  ggtitle('Public Twitter followers by party') +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_y_continuous(name = NULL, breaks = 0:15, minor_breaks = NULL,
                     labels = 10 ^ (0:15) - 1, limits = c(-0.08, 6.05)) +
  xlab(NULL)

The above is a boxplot. If you’re not familiar with this way of visualising distributions, the most important thing to know is that the thick horizontal lines show the median numbers of followers for candidates of each party, the boxes show the range within which the middle half of each party’s candidates fell (from the 25th to the 75th percentile), and the dots show the outliers (there’s Corbyn, right at the top with over a million followers). Note that the vertical axis is logarithmic, so each labelled step up represents roughly ten times as many followers.
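To make the anatomy of a boxplot concrete, here is a minimal sketch of how the median, the box, and the outliers can be computed by hand. The follower counts are made up for illustration (they are not from the candidate data), and the 1.5 × IQR rule for outliers is the default used by geom_boxplot.

```r
# Hypothetical follower counts for ten candidates of one party (illustration only)
followers <- c(12, 85, 150, 300, 420, 650, 900, 1500, 4000, 1200000)

# Same transformation as in the plot
x <- log10(followers + 1)

# The thick line is the median; the box runs from the 25th to the 75th percentile
q <- unname(quantile(x, c(0.25, 0.5, 0.75)))
iqr <- q[3] - q[1]

# Points more than 1.5 box-heights beyond the box are drawn as dots
lower.fence <- q[1] - 1.5 * iqr
upper.fence <- q[3] + 1.5 * iqr
outliers <- followers[x < lower.fence | x > upper.fence]

print(outliers)
```

In this toy example only the account with 1,200,000 followers counts as an outlier, which is the same way Corbyn ends up as a lone dot at the top of the Labour column.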

The boxplot shows that Green, Liberal Democrat, and UKIP candidates tended to have very similar (and relatively low) numbers of public followers, while candidates for the Labour Party, the Conservative Party, and the SNP all tended to have higher numbers. The SNP has the narrowest range and the highest median of any party and the Labour Party has the widest range of any party and a median that’s higher than that for the Greens, the Liberal Democrats, and UKIP, but lower than that for the SNP and the Conservatives.

Did the SNP win the ‘Twitter election’, then? Not necessarily. For one thing, its high outliers all fall within the ordinary range for Labour and Conservative candidates and therefore would not have been outliers had they stood for either of those parties. For another, there’s a factor that may explain the SNP’s high median: there were very few SNP candidates overall (because the party only fielded candidates in Scotland), and a very high proportion of them already had a high public profile from having sat in the House of Commons (because the 2015 General Election in Scotland was an SNP landslide).
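One quick way to check a suspicion like this is to tabulate the share of former MPs among each party’s candidates. The sketch below uses a small made-up data frame (toy) rather than the real candidate data, and assumes that was.MP is stored as a logical column.

```r
# Hypothetical toy data standing in for the real candidate data frame
toy <- data.frame(
  party.name = c(rep('SNP', 4), rep('Labour', 6)),
  was.MP = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

# The mean of a logical vector is the proportion of TRUE values,
# so this gives the share of former MPs within each party
mp.share <- tapply(toy$was.MP, toy$party.name, mean)

print(round(mp.share, 2))
```

On the real data, the equivalent call would presumably be tapply(cd$was.MP, cd$party.name, mean), again assuming a logical was.MP column.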

So let’s look at the effects of party, controlling for whether or not a candidate had previously been an MP and focusing only on Conservative, Labour, and SNP candidates (with Conservatives treated as the baseline):

parties.lab.con.snp <- cd[cd$party.name %in% c('Conservative','Labour','SNP'),]
summary(lm(log10(followers + 1) ~ party.name + was.MP, parties.lab.con.snp))
## 
## Call:
## lm(formula = log10(followers + 1) ~ party.name + was.MP, data = parties.lab.con.snp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8484 -0.2331  0.0736  0.3709  2.0588 
## 
## Coefficients:
##                  Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)       2.64776    0.03852  68.743 < 0.0000000000000002 ***
## party.nameLabour  0.13553    0.04244   3.193              0.00145 ** 
## party.nameSNP     0.19835    0.10309   1.924              0.05461 .  
## was.MPTRUE        1.20069    0.04217  28.475 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6677 on 1080 degrees of freedom
## Multiple R-squared:  0.4436, Adjusted R-squared:  0.442 
## F-statistic:   287 on 3 and 1080 DF,  p-value: < 0.00000000000000022

The above tells us that whether a candidate had been an MP in the past and, to a lesser degree, the party the candidate was standing for both have a relationship with the size of the candidate’s pre-election public Twitter following. The relationship between having been an MP and having a large following is much stronger (a coefficient of about 1.2 on the log10 scale, p < 0.001) than the relationship between party and following size. The SNP may have had a social media advantage over the Tories (the baseline here; the model doesn’t directly compare the SNP with Labour) that is not explicable purely by reference to the proportion of its candidates who had sat in the Commons before, but because it’s only statistically significant at a low level (p < 0.1), we can’t rule out the possibility that this may have been a fluke. Labour, meanwhile, appears to have had a smaller but much more clearly statistically significant social media advantage over the Tories (p < 0.01). This is the opposite of what we might have expected from the above boxplot. (A good example of why analysis cannot stop with visualisation. Sorry! The actual numbers are important.)
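Because the outcome is on a log10 scale, the coefficients are easier to read once they are back-transformed into multipliers. The sketch below does this using the estimates and standard errors copied from the output above, so it doesn’t need the original data. Strictly speaking, the multipliers apply to followers + 1 rather than to followers, and the confidence intervals are the usual t-based approximation.

```r
# Estimates and standard errors copied from the regression output above
est <- c(Labour = 0.13553, SNP = 0.19835, was.MP = 1.20069)
se  <- c(Labour = 0.04244, SNP = 0.10309, was.MP = 0.04217)

# Back-transform from the log10 scale to multiplicative factors
multiplier <- 10 ^ est

# Approximate 95% confidence intervals on the log10 scale, using the
# t distribution with the model's 1080 residual degrees of freedom
t.crit <- qt(0.975, df = 1080)
ci <- cbind(lower = est - t.crit * se, upper = est + t.crit * se)

print(round(multiplier, 2))
print(round(ci, 3))
```

Read this way, a Labour candidate would be expected to have roughly 1.4 times the following of an otherwise comparable Conservative, while having been an MP multiplies the expected following by roughly 16. The SNP interval just crosses zero, which is what p = 0.055 amounts to.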

The apparent advantage of Conservative candidates over Labour candidates thus turns out to have been no more than the advantage of sitting candidates over challengers (the Conservatives fielded fewer candidates overall, and had more sitting MPs to begin with). Once we control for that advantage, we find that — other things being equal — Labour candidates actually tended to have more followers than Conservative candidates.
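The reversal described above can be reproduced with a tiny made-up example. In the sketch below (toy data, not the real candidates), Labour candidates are given a fixed advantage of 0.2 on the log10 scale, but most Conservative candidates are former MPs and most Labour candidates are not. The raw party means then favour the Conservatives, while the regression recovers the Labour advantage.

```r
# Hypothetical toy data: 8 candidates per party, with different mixes of former MPs
toy <- data.frame(
  party.name = factor(rep(c('Conservative', 'Labour'), each = 8)),
  was.MP = c(rep(TRUE, 6), rep(FALSE, 2),   # Conservatives: mostly former MPs
             rep(TRUE, 2), rep(FALSE, 6))   # Labour: mostly challengers
)

# Build log10 followers from a known recipe: baseline 2.6,
# +1.2 for former MPs, +0.2 for Labour candidates
toy$log.followers <- 2.6 + 1.2 * toy$was.MP + 0.2 * (toy$party.name == 'Labour')

# Raw comparison, ignoring incumbency
raw.means <- tapply(toy$log.followers, toy$party.name, mean)

# Comparison controlling for incumbency (Conservative is the baseline level)
adjusted <- coef(lm(log.followers ~ party.name + was.MP, toy))

print(raw.means)
print(round(adjusted, 3))
```

Because the toy data contain no noise, the fitted coefficients simply return the recipe used to build them. The point is only that the raw means and the adjusted coefficient can point in opposite directions.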

So it seems that, when it comes to its rivalry with the Tories, Labour won the Twitter election after all.[1]


  1. It still lost the only election that counts, though. There’s no getting around that.