Scottish housing market: tax revenue forecasting models – review

Findings of an independent literature review of tax revenue forecasting models for the housing market.


1. See recommendation 3.32 of the Scottish Fiscal Commission's Report on Draft Budget 2016-17, available at

2. The Scottish Government's forecasting approach has evolved over time. For example, the modelling approach used for Draft Budget 2016-17 is described here:, while the most recent modelling approach, used for Draft Budget 2017-18, is described here:

3. Scottish Government data relating to housing supply (starts and completions) is available on a quarterly frequency by local authority back to 1996, and at a national level to 1980 (and to 1920 for annual completions). Registers of Scotland quarterly statistics on prices and transactions go back to 2003, and annual data to 1993. The ONS House Price Index is available on a monthly basis back to 2004 for Scotland, using the new methodology covering both mortgage and cash sales. ONS house price data under the previous methodology, which is limited to sales with a mortgage, is available for Scotland back to the early 1990s on a quarterly basis and to 1969 on an annual basis. The ONS Experimental Index of Private Housing Rental Prices has monthly data for Scotland back to 2011, while the Scottish Government Private Rent Statistics has annual data by broad rental market area and property size back to 2010. Quarterly data on mortgage affordability in Scotland is available back to the early 1990s from the ONS and to the 1970s from the Council of Mortgage Lenders, although there are breaks in methodology and coverage. Quarterly National Accounts Scotland, which include residential dwellings under gross fixed capital formation on an experimental basis, are available back to 1998.

4. We excluded several models from the analysis. We did not evaluate state-space models, because they can generally be represented as the univariate and multivariate formulations we cover. We did not cover non-linear models in depth beyond GARCH and threshold autoregressive models (non-linear models are geared toward asymmetrical data and data with outliers, and are ill-suited for multiple step-ahead forecasts). We also did not review non-parametric approaches such as neural network forecasting, as many do not have explicit models (that is, they are black boxes) and often result in specifications similar to the approaches we cover, but would do poorly according to many of our criteria. In practice, these approaches have met with limited success when applied to economic data (Stock (2002) suggests that the biggest excluding factor is small data sets in economics, whereas these methods are data-intensive).

5. Formally, a random walk model for housing prices, $P_t$, would be: $P_t = P_{t-1} + e_t$. A useful forecast for a random walk is $\hat{P}_{t+1} = P_t$, provided the random shock, $e_t$, is zero, on average.

A moving average model is defined as:

$\hat{P}_{t+1} = \frac{1}{k} \sum_{i=0}^{k-1} P_{t-i}$

where k is the number of periods over which the average is taken. For k = 1, the moving average is equivalent to using the last observation as the forecast. For k = t, the moving average uses all the observations and is equivalent to the historical mean (Makridakis et al., 2008).
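
The two limiting cases described in this footnote can be checked with a minimal sketch. The function name `moving_average_forecast` and the price figures are illustrative assumptions, not taken from the review.

```python
def moving_average_forecast(prices, k):
    """One-step-ahead forecast: the mean of the last k observations."""
    if not 1 <= k <= len(prices):
        raise ValueError("k must be between 1 and the number of observations")
    window = prices[-k:]          # the k most recent observations
    return sum(window) / k


# Hypothetical average house prices (in thousands), oldest first.
prices = [100.0, 104.0, 102.0, 110.0]

last_obs = moving_average_forecast(prices, 1)             # k = 1: random-walk forecast
hist_mean = moving_average_forecast(prices, len(prices))  # k = t: historical mean
print(last_obs, hist_mean)
```

With k = 1 the function returns the last observation, which is also the random walk forecast; with k equal to the sample length it returns the historical mean.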

6. A common generalised exponential smoothing approach is the Holt-Winters method. See Makridakis et al. (2008) for more on its theoretical derivation and application.
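
As a rough illustration of the smoothing recursions involved, the sketch below implements Holt's linear trend method, which is the non-seasonal building block of Holt-Winters; the seasonal component is deliberately omitted. The function name, the initialisation rule, and the smoothing parameters `alpha` and `beta` are assumptions chosen for illustration.

```python
def holt_linear_forecast(series, alpha, beta, horizon):
    """Holt's linear trend smoothing (no seasonal term).

    alpha smooths the level, beta smooths the trend; both lie in (0, 1).
    Returns forecasts for 1..horizon steps ahead.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]   # simple initialisation (an assumption)
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Hypothetical series with a constant increment of 2 per period.
print(holt_linear_forecast([10.0, 12.0, 14.0, 16.0], alpha=0.5, beta=0.5, horizon=2))
```

For a series that grows by a constant amount each period, the recursions reproduce that increment, so the forecasts simply continue the line.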

7. This has been simplified. The Moro and Nuño model allows for capital and labour to be weighted by their intensity in production ($\alpha_i$), where i = {c,g} and c is the construction sector and g is the general economy. The full specification, in continuous time notation, is:


8. See Box C1 in Budget 2007, Economic and Fiscal Strategy Report and Financial Statement and Budget Report available at

9. See Forecasts for the UK economy: various months, available at:

10. See Backgrounder-Canadian Economic Outlook, available at:

11. For example, see the discussion on oil prices in the Bank of Canada's July 2014 Monetary Policy Report, available at:

12. For example, the Investment Property Forum (IPF) has a number of survey products available:

13. The rules of thumb based on moving averages and exponential smoothing models described in Subsection 3.1 are special cases of the model class described here. However, in the context of this section they are not necessarily fixed and mechanical, and require greater analyst effort and judgment in their specification.

14. The AR component of the model was first introduced by Yule (1926). Slutzky (1937) described MA models. Wold (1938) combined the AR and MA models and showed that ARMA processes can be used to model all stationary time series, provided the numbers of AR and MA terms are properly specified. Box and Jenkins (1976) popularized the ARIMA models and created a structured model selection process.

15. The first difference of $x_t$ ($\Delta x_t$) is formed by subtracting the value in the previous period from the value in the current period: $\Delta x_t = x_t - x_{t-1}$

16. A variable is integrated of order one, I(1), if differencing it once renders it stationary. It is integrated of order two, I(2), if it must be differenced twice to render it stationary.
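
A small sketch may help make differencing concrete. The helper `difference` and the example series are illustrative assumptions; the deterministic series is used only to show the mechanics, and in practice the order of integration would be assessed with formal unit-root tests rather than by inspection.

```python
def difference(series, order=1):
    """Apply the first-difference operator `order` times."""
    result = list(series)
    for _ in range(order):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Hypothetical series with a quadratic trend: 1, 4, 9, 16, 25.
x = [1, 4, 9, 16, 25]
print(difference(x, 1))   # still trending after one difference
print(difference(x, 2))   # constant after two differences
```

Each pass shortens the series by one observation, which is worth bearing in mind when the sample is already small.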

17. Formally, a Markov process's future and past are independent, conditional on its current state.

18. In this context, policy analysis means estimating the impact of changes in housing market policy on the wider economy and vice versa. The direct impact of policy changes relating to tax rates and thresholds on tax revenue can still be assessed by combining univariate models which forecast average prices and transactions with house price distribution models, such as using a standard distribution (e.g. the log-normal distribution used in Scottish Government forecasts) or micro-simulation techniques/micro-databases, as discussed in Subsection 3.8.
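
The combination of a forecast average price, a forecast transaction count, and a log-normal price distribution can be sketched as follows. This is a minimal illustration under stated assumptions, not the Scottish Government's method: the tax bands, the dispersion parameter `sigma`, and the function names are all hypothetical, and the distribution is approximated on an evenly spaced grid of quantiles.

```python
from math import exp, log
from statistics import NormalDist


def banded_tax(price, bands):
    """Tax due under a slice (marginal-rate) schedule.

    bands: list of (lower_threshold, marginal_rate), sorted ascending.
    """
    tax = 0.0
    for i, (lower, rate) in enumerate(bands):
        upper = bands[i + 1][0] if i + 1 < len(bands) else float("inf")
        if price > lower:
            tax += (min(price, upper) - lower) * rate
    return tax


def expected_revenue(mean_price, sigma, transactions, bands, n=10_000):
    """Approximate revenue when prices are log-normal with the given mean."""
    mu = log(mean_price) - 0.5 * sigma ** 2   # so that E[price] = mean_price
    std_normal = NormalDist()
    total = 0.0
    for i in range(n):
        # Price at the midpoint of the i-th quantile slice.
        price = exp(mu + sigma * std_normal.inv_cdf((i + 0.5) / n))
        total += banded_tax(price, bands)
    return transactions * total / n


# Hypothetical schedule: 0% to 100k, 2% from 100k to 300k, 5% above 300k.
bands = [(0, 0.0), (100_000, 0.02), (300_000, 0.05)]
print(expected_revenue(mean_price=150_000, sigma=0.5, transactions=1_000, bands=bands))
```

Changing the thresholds or rates in `bands` while holding the price and transaction forecasts fixed gives the static (no behavioural response) revenue effect of a policy change.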

19. Multivariate econometric regression models are often called behavioural models. When looking at average house prices and total transactions, behaviour refers to the relationships at an aggregate, economy-wide level. This contrasts with behavioural equations at the individual household level, for example modelling the impact of mortgage interest rates on decisions on house size. Modelling of the aggregate household sector often assumes that these same decisions can be observed in macroeconomic relationships.

20. This is a heavily stylized model based on equations (2) and (5) of Tsolacos (2006).

21. The asset approach can be quite involved to implement in practical terms. There are simpler alternatives. For example, the housing affordability approach models house prices by examining the current affordability of housing relative to its long-run trend. The measure could be the ratio of per capita household income to the average house price or the ratio of mortgage interest payments (and capital repayments) to disposable income. If these ratios get too out of line with their historical trend, there should be pressure to move back toward the trend (Brooks and Tsolacos, 2010).
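
The affordability idea can be sketched in a few lines. The function `affordability_gap`, the use of the sample average as the long-run trend, and the figures are assumptions for illustration only.

```python
def affordability_gap(incomes, prices):
    """Proportional deviation of the latest income-to-price ratio
    from its historical average (used here as the long-run trend)."""
    ratios = [inc / p for inc, p in zip(incomes, prices)]
    long_run = sum(ratios) / len(ratios)
    return ratios[-1] / long_run - 1


# Hypothetical per capita incomes and average house prices (thousands).
incomes = [25.0, 25.0, 25.0, 25.0]
prices = [100.0, 100.0, 100.0, 125.0]
print(affordability_gap(incomes, prices))
```

A negative gap means income is low relative to prices compared with history, which under this approach implies pressure on prices to move back toward the trend.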

22. Usually in economics, markets are modeled as the quantity demanded at different prices. An inverted demand function solves the system differently to show the price level for a specific quantity demanded.

23. Structural models allow contemporary effects of variables on one another, whereas reduced-form equations express dependent variables as a function only of lagged values of itself and other independent variables. A reduced-form model is usually derived from a structural set of equations through algebraic manipulation and makes estimation easier.

24. Technically, they use 'stacked' OLS regressions, which means estimating an equation for each city while constraining all coefficients to be the same across cities.
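
With a single explanatory variable, the stacking idea reduces to pooling every city's observations and fitting one intercept and one slope. The sketch below is an illustration under that simplifying assumption; the function name, city names, and data are hypothetical and are not drawn from the study cited.

```python
def stacked_ols(city_data):
    """Pool (x, y) observations from all cities and fit y = a + b*x once,
    so that a and b are common to every city."""
    xs = [x for obs in city_data.values() for x, _ in obs]
    ys = [y for obs in city_data.values() for _, y in obs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated from y = 2 + 3x in both cities.
data = {
    "city_a": [(1.0, 5.0), (2.0, 8.0), (3.0, 11.0)],
    "city_b": [(4.0, 14.0), (6.0, 20.0)],
}
print(stacked_ols(data))
```

Estimating each city separately would instead give one coefficient pair per city; stacking trades that flexibility for more observations per estimated parameter.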

25. The full specification in Auterson (2014) extends this model to an error-correction framework.

26. Specifically, they suggest that no variables in the economy can be taken as exogenous.

27. This structure permits all variables to be treated as jointly endogenous (y is determined by x and x is determined by y).

28. As discussed in Subsection 3.2, variables that trend over time with non-constant variance are called "integrated". When variables are integrated, hypothesis tests are not valid (the t-statistics and F-statistics do not conform to the t- and F-distributions).

29. Technically, if variables have returned to their long-run steady state, then the first difference:


does not have a solution: the changes in x and y are zero and the terms drop out.

30. De Vroey (2016) provides a useful history.

31. Depending on the National Accounts specification, households can also include unincorporated businesses.

32. The investment is driven by the difference between the rental price of housing services and the costs of those funds (interest rates). This is known as a Q-ratio.

33. Granger and Newbold (1986) provide a summary of several studies that demonstrated that univariate time-series models can outperform large-scale macroeconometric models.

34. See the ITEM team website at:

35. As most survey data focuses on individuals, there are few microsimulation models for businesses; however, some simple micro data models make use of Corporate Income Tax returns.

36. As a result of real income growth, this will be the case for personal income taxes even if tax thresholds are grown with inflation.

37. Exceptions include HMRC's SDM+, SustainCity (Morand, Toulemon, Pennec, Baggio, and Billari, 2010), and Australia's DYNAMOD I & II.

38. The authors find that the stimulus was "enormously" successful, increasing transactions volumes by as much as 20 per cent in the short run. This magnitude was partly the effect of moving transactions forward, but persistent (though smaller) effects were also observed over the medium term following the end of the policy.

39. This contrasts with the traditional frequentist approach of estimating all parameters from historical relationships. Gelman, Carlin, Stern, and Rubin (2004) refer to the traditional approach as retrospective evaluation.

40. This is done, for example, by setting normal prior distributions with zero means and standard deviations that become relatively small as the lags increase.
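
One common way to make prior standard deviations shrink with the lag is to divide an overall tightness parameter by a power of the lag length. The sketch below assumes that functional form; the function name and the parameter values `tightness` and `decay` are hypothetical and are not taken from the review.

```python
def lag_prior_std(max_lag, tightness=0.2, decay=1.0):
    """Prior standard deviation for the coefficient on each lag 1..max_lag.

    Assumed form: tightness / lag**decay, so that longer lags are
    shrunk more strongly toward the zero prior mean.
    """
    return [tightness / lag ** decay for lag in range(1, max_lag + 1)]


for lag, std in enumerate(lag_prior_std(4), start=1):
    print(f"lag {lag}: prior N(0, {std:.3f}^2)")
```

A smaller `tightness` pulls all lag coefficients closer to zero, while a larger `decay` penalises distant lags more heavily relative to recent ones.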

41. HMRC's model is programmed in the General Algebraic Modelling System (GAMS) software using the mathematical programming system for general equilibrium (MPSGE) solver.

42. In addition to probit models, Larson (2011) found that error correction models of housing prices in California before the financial crisis could forecast large declines before they were observed (although the ability to predict the timing was poor).

43. See the discussion on 2015-16 non-residential LBTT revenues in the Draft Budget 2017-18 Devolved Taxes Methodology Report, available at


45. For example, this forecast error decomposition plays an important role in HMRC's Measuring tax gaps reports (various). The 2016 report is available at:

46. The OBR uses the term "imposed" instead of "exogenous", making the distinction that the forecasts have been determined outside of the model, but using the model's solutions, so are endogenous but done off-model for practical reasons.

47. Available at:

48. Pike and Savage (1998) provide an overview of the way in which HMT macroeconomic model forecasts of taxes and tax bases are combined, compared, and constrained to the more detailed tax revenue forecasts of HMRC. Although the paper is dated, practitioners report that the process has remained largely unchanged (with the exception of scrutiny by the OBR).


Email: Jamie Hamilton

Phone: 0300 244 4000 – Central Enquiry Unit

The Scottish Government
St Andrew's House
Regent Road
