Chapter 11 Preparations for the final assessment

11.1 Seminar

The content of this final seminar is designed to help you prepare for the final assessment.

11.1.1 Petition to Revoke Article 50

On February 14, 2019, a petition was created on the petition.parliament.uk website asking for the revocation of Article 50. The petition gained little traction at first, but once the Prime Minister was forced to request an extension to the date for leaving the EU, the number of signatures rose sharply. At 10 PM on March 25, 2019, the number of signatures was 5,627,362, making it the most successful petition in UK history.

You are asked to analyse the signatures and explain what might account for the large number of signatures. For that, you are provided with a dataset that merges the signature data taken from the Parliament’s website with a dataset from the British Election Study that contains the results of the 2017 election and census information. All variables are measured at the constituency level.

The dataset can be downloaded from the URL below, or you can read it directly into R using:

dat <- read.csv("https://raw.githubusercontent.com/QMUL-SPIR/Public_files/master/datasets/A50_petition_2017_results.csv")

You can access the codebook for the 2017 election dataset here. Look for variables of interest in this codebook.

Your key dependent variable is signature_rate, which represents the number of signatures in each constituency relative to the number of eligible voters in the same constituency in the 2017 election, that is:

\[ signatureRate = \frac{\text{Number of signatures}_{constituency}}{\text{Number of eligible voters in 2017}_{constituency}} \]
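As a worked illustration of this ratio, the short sketch below computes the rate for a single made-up constituency. The figures and the object names (signatures, eligible_voters) are hypothetical and are not taken from the dataset. Whether signature_rate is stored as a proportion or as a percentage is an assumption you should check yourself, for example with summary(dat$signature_rate).

```r
# Hypothetical figures for one constituency (NOT taken from the dataset)
signatures      <- 12500   # number of petition signatures
eligible_voters <- 75000   # eligible voters in the 2017 election

# The ratio defined in the formula above
rate <- signatures / eligible_voters

# The same quantity expressed as a percentage
rate_pct <- 100 * rate

print(round(rate, 4))
print(round(rate_pct, 2))
```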

Your main independent variable is the percentage of Remain votes in the 2016 referendum. This variable is called remainHanretty, because the constituency-level results were estimated by Prof. Chris Hanretty (RHUL). The reference for the data is Hanretty (2017), ‘Areal interpolation and the UK’s referendum on EU membership’.

With this data, you should:

  1. Present a hypothesis about the role of the remain vote on predicting the signature rate.

  2. Use plots to present the distribution of these variables.

  3. Fit a bivariate model with the signature_rate variable as dependent variable and remainHanretty as the independent variable. Create a table with the results using screenreg() and discuss the results.

  4. Choose 5 other variables from the dataset that might have a relationship with the signature rate.

  5. Produce a plot for each of these 5 variables according to their level of measurement.

  6. Write down a hypothesis for each one of the variables.

  7. Fit a multivariate model using the new variables along with the remainHanretty variable. Create a table comparing both models with screenreg(), and discuss the results.

    1. Make sure you look at the potential change in the remainHanretty coefficient between models.
    2. Pay special attention to any other significant results.
    3. Compare the Adjusted R-squared from the models.
    4. Check the multivariate model for heteroskedasticity and correct if necessary.
  8. Write down your results as if they were written in a political science or IR journal article.
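
The modelling steps above can be sketched as follows. This is a minimal illustration rather than a model answer: it runs on simulated stand-in data so that it works without the download, and only signature_rate and remainHanretty are names from the real dataset. The columns turnout and pct_degree, the object sim, and all simulated values are hypothetical. It assumes the texreg, lmtest and sandwich packages are installed, and it uses a Breusch-Pagan test with HC1 robust standard errors as one possible way to check and correct for heteroskedasticity.

```r
library(texreg)    # screenreg()
library(lmtest)    # bptest(), coeftest()
library(sandwich)  # vcovHC()

# Simulated stand-in data (hypothetical values, NOT the real dataset)
set.seed(123)
n <- 200
sim <- data.frame(
  remainHanretty = runif(n, 20, 80),  # stand-in for the % Remain estimate
  turnout        = runif(n, 55, 80),  # hypothetical column name
  pct_degree     = runif(n, 10, 60)   # hypothetical column name
)
# Outcome whose error variance grows with remainHanretty (heteroskedastic)
sim$signature_rate <- 0.5 + 0.15 * sim$remainHanretty +
  0.10 * sim$pct_degree + rnorm(n, sd = 0.02 * sim$remainHanretty)

# Step 2: distributions and the bivariate relationship
hist(sim$signature_rate, main = "Signature rate", xlab = "signature_rate")
plot(sim$remainHanretty, sim$signature_rate,
     xlab = "remainHanretty", ylab = "signature_rate")

# Step 3: bivariate model
m1 <- lm(signature_rate ~ remainHanretty, data = sim)

# Step 7: multivariate model and a side-by-side table
m2 <- lm(signature_rate ~ remainHanretty + turnout + pct_degree, data = sim)
screenreg(list(m1, m2))

# Step 7.3: compare the adjusted R-squared of both models
adj_r2 <- c(bivariate    = summary(m1)$adj.r.squared,
            multivariate = summary(m2)$adj.r.squared)
print(adj_r2)

# Step 7.4: Breusch-Pagan test; a small p-value suggests heteroskedasticity
bp <- bptest(m2)
print(bp)

# One possible correction: heteroskedasticity-robust (HC1) standard errors
robust <- coeftest(m2, vcov. = vcovHC(m2, type = "HC1"))
print(robust)
```

To apply the same sketch to the seminar data, replace sim with dat and swap the hypothetical columns for the variables you selected from the codebook.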