Local level Brexit vulnerabilities in Scotland: Brexit Vulnerabilities Index (BVI)

This research identifies the areas of Scotland that are expected to be most vulnerable to the consequences of Brexit, and what drives those risks, to support local authorities and other organisations in understanding local risks around EU exit.

Annex 2 Technical Annex

The Brexit Vulnerability Index (BVI)

The Brexit Vulnerability Index is an index which combines data from eight indicators. Each indicator has been chosen because it provides quantifiable evidence likely to give either a direct or an indirect measure of issues relating to the UK leaving the EU. This suite of indicators provides data for each of the 6,976 datazones in Scotland. Datazones are small area geographical units used for statistical measurement, with a population of around 770 people.

Eight variables are used to calculate the BVI:

  • Access to Services;
  • Working age population;
  • Income deprivation ranking;
  • Population Change;
  • Workers in Brexit sensitive industries;
  • EC Payments received through
    • CAP and
    • ESF and ERDF; and
  • EU Worker Migration.

The rationale for selecting each of the specific variables is discussed in Chapter 3 of the report.

Each variable is first standardised by ranking the values. This is necessary because the variables are measured on different scales. Ranking ensures that all variables have identical distributions with the same range, and therefore the same maximum and minimum values.
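The ranking step can be sketched as follows. This is a minimal illustration, not the published implementation: the function name `average_ranks` and the choice to give tied values their average rank are assumptions, since the annex does not say how ties are handled.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`, smallest value ranked 1.

    Tied values share the average of the positions they occupy
    (an assumption; the annex does not specify tie handling).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


# Two hypothetical variables measured on very different scales
# end up on the same rank scale after standardisation.
share_variable = [0.2, 0.9, 0.5]
payment_variable = [1500, 90000, 42000]
print(average_ranks(share_variable))
print(average_ranks(payment_variable))
```

Both calls produce ranks running from 1 to 3, which is what allows variables with different units to be combined later.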

However, using the ranks alone would result in distributions which are symmetrical, and one variable indicating vulnerability could be fully 'cancelled out' by lack of vulnerability in another. This does not reflect the prior distribution of the variables and gives undue weight to the least vulnerable scores.

Simply using the symmetrical ranks is inappropriate given that high ranks signify less vulnerability and do not imply a lack of vulnerability to Brexit. A transformation is required to address these issues, and the exponential transformation of the ranks was chosen as the most appropriate method. This is in line with the methodology used by the Scottish Index of Multiple Deprivation (SIMD).[27]

The exponential transformation deals with this question of variables cancelling each other out. It has the advantage that every variable is converted to an identical distribution with the same maximum and minimum values, whilst emphasising the most vulnerable 'tail' of the distribution. The transformation 'draws out' the ranks of the most vulnerable datazones so that spaces are introduced between datazones that reflect the actual distributions.

The formula for the calculation is:

X = -23*log{1-R*[1-exp(-100/23)]}

where R is the rank transformed to the range [0,1] (for the exponential transformation the least vulnerable datazone is ranked 1 and the most vulnerable datazone is ranked 6,976), log is the natural logarithm and exp is the exponential function.
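The formula can be written as a short function. This is a sketch under one stated assumption: the annex does not say how the rank is scaled to [0,1], so the helper `scaled_rank` (a hypothetical name) simply divides the rank by the number of datazones.

```python
import math

K = 23.0  # constant from the formula above


def scaled_rank(rank, n_datazones):
    """Scale a 1-based rank to (0, 1]. Assumption: R = rank / N."""
    return rank / n_datazones


def exp_transform(r):
    """Exponential transformation X = -23 * log(1 - R * (1 - exp(-100/23)))."""
    return -K * math.log(1.0 - r * (1.0 - math.exp(-100.0 / K)))


# The endpoints map to 0 and 100; the middle of the ranking scores well below 50.
for r in (0.0, 0.5, 0.9, 1.0):
    print(r, round(exp_transform(r), 1))
```

Under this scaling the most vulnerable datazone (rank 6,976) has R = 1 and a score of 100, while a datazone in the middle of the ranking scores only around 16.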

The constant -23 gives a 10% cancellation property. To illustrate why this property is desirable, suppose two variables were equally weighted and a cancellation factor was not applied. A datazone which was most vulnerable on one of the variables and least vulnerable on the other would be ranked at the 50th percentile. However, it does not seem appropriate to suggest that lack of vulnerability in one variable should exactly cancel out an entirely different dimension of vulnerability in another. Using the 10% cancellation property, the datazone would be ranked within the 10% most vulnerable datazones. This was considered to be more appropriate.
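The two-variable illustration can be checked numerically. The sketch below assumes that equally weighted variables are combined by a simple average, and it uses a hypothetical helper `inverse_transform` to read a score back as a position on a single variable's rank scale.

```python
import math

K = 23.0


def exp_transform(r):
    """Score for a rank r already scaled to [0, 1]."""
    return -K * math.log(1.0 - r * (1.0 - math.exp(-100.0 / K)))


def inverse_transform(x):
    """Scaled rank in [0, 1] that would produce the score x."""
    return (1.0 - math.exp(-x / K)) / (1.0 - math.exp(-100.0 / K))


# Most vulnerable on one variable (R = 1), least vulnerable on the other (R = 0).
plain_average = (1.0 + 0.0) / 2
transformed_average = (exp_transform(1.0) + exp_transform(0.0)) / 2
equivalent_rank = inverse_transform(transformed_average)

print(plain_average)              # averaging raw ranks lands at the midpoint
print(round(equivalent_rank, 2))  # averaging transformed scores lands near the top 10%
```

Averaging the raw ranks places the datazone exactly at the midpoint, whereas averaging the transformed scores gives 50, which corresponds to roughly the 90th percentile of a single variable.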

Following the exponential transformation, the datazones have scores ranging between 0 (least vulnerable) and 100 (most vulnerable) on each variable. In addition, the scores increase exponentially so that the most vulnerable datazones have more prominence. The 10% cancellation factor means that the most vulnerable 10% of datazones are emphasised with scores between 50 and 100 whilst the remaining 90% of datazones have scores between 0 and 50. Thus the exponential transformation successfully deals with the issues of cancellation and symmetry.
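The split between the top 10% and the remaining 90% can be verified by transforming every rank. This is a rough check, assuming the rank is scaled as R = rank / 6,976, which the annex does not state explicitly.

```python
import math

K = 23.0
N_DATAZONES = 6976  # number of datazones in Scotland, as stated above


def exp_transform(r):
    """Score for a rank r already scaled to [0, 1]."""
    return -K * math.log(1.0 - r * (1.0 - math.exp(-100.0 / K)))


# Transform every rank from 1 (least vulnerable) to 6,976 (most vulnerable).
scores = [exp_transform(rank / N_DATAZONES) for rank in range(1, N_DATAZONES + 1)]

# Share of datazones whose score on one variable is 50 or more.
share_at_or_above_50 = sum(s >= 50.0 for s in scores) / N_DATAZONES
print(f"{share_at_or_above_50:.1%}")
```

The printed share comes out at approximately one in ten datazones, in line with the 10% cancellation factor.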

Weights are applied based on the relative importance of each variable, as discussed in Chapter 3, and on data quality and potential correlations. The standard weight was set at 20, which corresponds to 12%, or around one eighth, of the overall index.

  • Whilst none of the variables are highly correlated (defined as having a Pearson Correlation Coefficient above 0.69), the variable for CAP payments is moderately correlated with the variables measuring the Brexit Workforce and Access to Services. Thus, despite good data quality, it was decided to weight CAP down relative to the remaining variables. The weight was set at 10, or 6% of the BVI.
  • It was further decided to weight down ESF and ERDF due to poorer data quality. This is because payments are allocated to local authorities. In order to disaggregate data, it was decided to distribute the local authorities' payments to datazones using population weights. Therefore, ESF/ERDF payments by datazones are only an estimate and not as accurate as the remaining variables. ESF and ERDF are weighted with factor 5 and account for 3% of the overall BVI.
  • Income deprivation data is both of high quality and relatively more important than the remaining variables (see Chapter 3). Thus, the variable counts double, with a weight of 40, or 24% of the BVI.
  • Access to Services, Population Decline and the Share of the Working Age Population are given the standard weight of 20 and each account for 12% of the BVI. This is because of high data quality, weak correlation coefficients and relative importance. However, because Access to Services is relatively more relevant to rural areas, it was decided not to weight this variable up. Furthermore, it was decided that because Population Decline and the Share of the Working Age Population measure similar aspects – despite being only weakly correlated – these should not be weighted up further.
  • EU National Workers, as measured by National Insurance Registrations, are highly relevant, but because data is only available at intermediate zone level and had to be allocated to datazones using population weights, the variable remains at the standard weight of 20, or 12% of the BVI.
  • The share of the Workforce in Brexit-Sensitive Industries is highly relevant (see Chapter 3) and data quality is high. The indicator is therefore weighted up, but not as highly as Income Deprivation. It was given a weight of 30 and accounts for 18% of the BVI.
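The weights listed above can be collected in one place to check that the quoted percentages follow from them. The dictionary keys below are shorthand labels chosen for this sketch, not official variable names.

```python
# Weights as stated in the bullets above.
weights = {
    "Income deprivation": 40,
    "Workers in Brexit-sensitive industries": 30,
    "Access to Services": 20,
    "Working age population": 20,
    "Population change": 20,
    "EU worker migration": 20,
    "CAP payments": 10,
    "ESF and ERDF payments": 5,
}

total_weight = sum(weights.values())

# Each variable's share of the index, rounded to the nearest whole percent.
shares = {name: round(100 * w / total_weight) for name, w in weights.items()}

for name, share in shares.items():
    print(f"{name}: weight {weights[name]}, {share}% of the BVI")
```

The weights sum to 165, so the standard weight of 20 is 20/165, or about 12% of the index, and the other percentages follow in the same way.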

The overall BVI score is then constructed by combining the exponentially transformed and weighted variables. The larger the BVI score, the more vulnerable the datazone. However, in order to compare datazones it is important to use the relative order of the ranks. It is not correct, for example, to say that datazone X is twice as vulnerable as datazone Y because the BVI for X is 50 and that for Y is 25. This is due to the transformation of the data that takes place to enable a variable score to be produced. It is equally untrue to say that a datazone of rank 50 is twice as vulnerable as a datazone of rank 100. However, a datazone of rank 75 is more vulnerable than a datazone of rank 125.
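The combination and ranking step might look like the sketch below. The annex does not give the exact combination rule, so a weighted average of the transformed scores is assumed here. The datazone labels, the score values and the function names `bvi_score` and `bvi_ranks` are all hypothetical.

```python
# Weights from the annex; the keys are shorthand labels for this sketch.
WEIGHTS = {
    "income": 40, "industries": 30, "services": 20, "working_age": 20,
    "population": 20, "eu_workers": 20, "cap": 10, "esf_erdf": 5,
}


def bvi_score(scores):
    """Weighted average of transformed scores (assumed combination rule)."""
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total


def bvi_ranks(bvi_by_zone):
    """Rank datazones so that rank 1 is the most vulnerable (highest score)."""
    ordered = sorted(bvi_by_zone, key=bvi_by_zone.get, reverse=True)
    return {zone: position for position, zone in enumerate(ordered, start=1)}


# Illustrative datazones with made-up transformed scores.
zones = {
    "X": {name: 50.0 for name in WEIGHTS},
    "Y": {name: 25.0 for name in WEIGHTS},
    "Z": {**{name: 0.0 for name in WEIGHTS}, "income": 100.0},
}
bvi = {zone: bvi_score(scores) for zone, scores in zones.items()}
print(bvi_ranks(bvi))
```

Only the ordering is meaningful: X ranks above Y, but its score of 50 against 25 does not mean it is twice as vulnerable.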


Email: ruralstatistics@gov.scot
