Scottish housing market: tax revenue forecasting models – review

Findings of an independent literature review of tax revenue forecasting models for the housing market.

2. Methodology and evaluation criteria

To identify candidate approaches, we searched peer-reviewed journals and the research of public and private sector forecasters. We also interviewed experts on housing market forecasting and budget preparation in several OECD governments, independent fiscal scrutiny bodies, and central banks. We narrowed our evaluation to the model classes in Section 3 based on these initial findings. We then selected the literature that was most relevant to Scotland to support a systematic assessment of these model classes.

The criteria against which we evaluated each model class are listed in Table 1. We first assessed whether the model was likely to be suitable for the Scottish housing market. We divided the application criteria into two components: 1) forecasting, and 2) policy analysis. The distinction between these two objectives is important for model selection and is discussed in detail in Box 1.

Assessing accuracy is difficult without developing the models themselves and comparing their out-of-sample forecasting properties to the current approach. As will be shown, research suggests that the accuracy of different model classes depends on the unique circumstances under which the model is asked to perform (that is, the forecast period, geographical region, and position within the business cycle). We were able, however, to provide indicative evidence from comparative studies. Because models tend to have different forecasting properties over the short and medium term, we attempted to find evidence on performance in the first eight quarters and the last three years of the intended five-year forecasting horizon. This breakdown could be useful for combining models.
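To make the horizon split concrete, here is a minimal sketch of how out-of-sample accuracy might be scored separately over the short and medium run, using RMSE (one of several possible assessment statistics). The forecast errors are synthetic and purely illustrative, not drawn from any model discussed in this review:

```python
import math
import random

def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Illustrative only: synthetic quarterly forecast errors over a
# five-year (20-quarter) horizon for a hypothetical model, with
# error variance growing as the horizon lengthens.
random.seed(1)
errors = [random.gauss(0, 1 + 0.1 * h) for h in range(20)]

# Split the assessment as in the review: short run = quarters 1 to 8,
# medium run = years three to five (quarters 9 to 20).
short_run = rmse(errors[:8])
medium_run = rmse(errors[8:])
print(f"short-run RMSE (q1-q8):   {short_run:.2f}")
print(f"medium-run RMSE (q9-q20): {medium_run:.2f}")
```

Scoring the two sub-horizons separately in this way is also what makes forecast combination possible: one model class can be weighted more heavily in the near term and another in the later years.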

We then assessed several elements of the model's ease of communication, including whether the model can tell a convincing story for both its current path and revisions since the last forecast round, and whether it is likely to be transparent to stakeholders.

We also ranked the model on whether its data requirements are likely to exist and be met in a timely manner, and on the resources the model is likely to require to develop, run, and maintain over time.

Each criterion was assigned a summary score of good, fair, or poor, based on the most likely scenario we can foresee without carrying through the model development itself. Results may vary in practice.

Box 1: forecasting models versus policy models

A first question to ask is: what will the model be expected to do? If it will be required only to forecast baseline revenues, a wide variety of forecasting approaches are possible. However, forecasting models in a public budget setting are often asked to do other analysis, broadly defined as policy analysis. Policy analysis includes:

  • Fiscal impact costing. The model may be asked to assess the fiscal cost (that is, the increase or decrease in government revenues) of changes in housing market policy or changes in LBTT policy, such as rates and thresholds.
  • Scenario analysis. The model may be asked to assess the impact on housing demand of changes to other economic and fiscal assumptions (for example, stricter financial regulation or higher social assistance for lower income households).
  • Risk analysis. Budget forecasters often publish a table of fiscal sensitivities or risks to the outlook (that is, how a change in GDP, inflation, or other inputs may affect revenues). This is useful, for example, if a minister is interested in how LBTT revenues would change if the outlook for economic growth were to fall below the budget's assumption, or if the central bank were to miss its inflation target.
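A fiscal sensitivity of the kind described above can be illustrated with a constant-elasticity rule of thumb that translates a deviation in an economic input into a revenue impact. The function name, the elasticity value, and the revenue figures below are hypothetical and chosen only for illustration:

```python
def revenue_sensitivity(baseline_revenue, elasticity, input_change_pct):
    """Approximate revenue change from a change in an economic input,
    using a constant-elasticity rule of thumb.

    baseline_revenue:  forecast revenue under budget assumptions
    elasticity:        % change in revenue per 1% change in the input
    input_change_pct:  deviation of the input from the budget assumption, in %
    """
    return baseline_revenue * elasticity * input_change_pct / 100.0

# Hypothetical numbers: if LBTT revenue responded with an elasticity of
# 1.5 to house prices, a 2% price shortfall would lower a 600m forecast by:
delta = revenue_sensitivity(600.0, 1.5, -2.0)
print(f"revenue impact: {delta:+.1f}m")  # prints revenue impact: -18.0m
```

A published sensitivity table is essentially a grid of such calculations, one row per input, with the elasticities taken from the model rather than assumed.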

Policy and risk analysis requires models that have been developed to capture the underlying process that generates the data. This largely rules out, for example, univariate forecasting models that make no attempt to describe how the housing market is influenced by other economic and fiscal variables of interest (and therefore cannot use alternative assumptions to compute costings, scenarios, and fiscal sensitivities).

Forecasting and policy models place different emphasis on the intertemporal dynamics of the data (the way that past values influence current and future values). Forecasting models prioritise the observed intertemporal properties of the data above all else. Policy models, on the other hand, rely on economic and statistical theory to drive the equation specification. By emphasising theory, policy models sometimes miss out on potentially useful forecasting information in the dynamics of the data.
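The contrast can be sketched with a toy example: a univariate model regresses the series on its own lag and captures the dynamics but offers no lever for policy analysis, while a policy model regresses the series on an economic determinant whose coefficient can be used for scenarios. All series, names, and coefficients below are illustrative, not estimates from any real data:

```python
def ols_slope(x, y):
    """Simple one-regressor OLS slope (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

# Toy quarterly series: transactions driven by income (the "true" process).
income = [100 + t for t in range(12)]
transactions = [0.5 * inc for inc in income]

# Univariate forecasting model: regress transactions on their own lag.
# It captures the dynamics but has no handle to turn for policy analysis.
ar_slope = ols_slope(transactions[:-1], transactions[1:])

# Policy model: regress transactions on income. The slope is an economic
# parameter that can be used for scenarios ("what if income grows faster?").
beta = ols_slope(income, transactions)
print(f"AR(1) slope: {ar_slope:.2f}, income slope: {beta:.2f}")
# prints AR(1) slope: 1.00, income slope: 0.50
```

Both models fit this toy series perfectly, but only the second exposes a parameter (the income slope) that alternative assumptions can act on, which is the property Box 1 identifies as essential for costings, scenarios, and sensitivities.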

We did not attempt to assign weights of relative importance to the criteria, and their interaction is not straightforward. For example, a model that is specified to easily attribute forecast errors to economic determinants (that is, it scores highly on the communication criterion) is likely to require forecasts of exogenous economic variables. Forecasting exogenous variables may require hiring more analysts (reducing the model's resources score) and may introduce additional uncertainty (reducing the model's accuracy score). Scores should therefore only be taken as indicative and should not be considered in isolation.

Table 1: Assessment criteria

Application: forecasting

This score reports whether examples of the model class are common in the academic and practitioner forecasting literature for housing prices and transactions. Further, model classes that can produce ex ante forecasts (that is, unconditional forecasts without auxiliary forecasts of exogenous variables) will score well. Models that require auxiliary forecasts of exogenous variables will score in the mid-range (fair), and models that are not particularly suited to forecasting the housing market for tax purposes will score poorly.

Application: policy analysis

If a model is a poor choice for forecasting, it may still be useful for policy. The purposes of policy models range from costing tax changes and assessing government interventions to dynamic scoring (estimating the feedback effects of fiscal policy on the economy) and scenario analysis (changing assumptions such as immigration levels or the exchange rate). Models that are structural (specified with an economic theory in mind) will score well.

Accuracy: short run (quarters one to eight) and medium run (years three to five)

Forecast accuracy can be measured using several different forecast assessment statistics. Although we did not develop the models and test their accuracy ourselves, we attempted to find comparative studies and rank general forecast conclusions relative to other model classes. A good score means the model class generally performed better than its peers (and poor, worse); a fair score suggests that the evidence was mixed. Because models have different properties over different horizons, we split this into two assessments: one for the short run (the first eight quarters of the outlook) and one for the medium run (years three to five of the five-year outlook).

Communication: story telling

Forecasters are often asked to explain their model's outlook to policymakers and other stakeholders, and to explain any revisions since the previous forecast round. Models with a theoretical basis for their specification and a direct causal interpretation will score well on this criterion; models that are mostly a black box built on historical statistical properties will score poorly. Additionally, the forecast for a budget line item must not only tell an intuitive story in its own right, it must also be presented and framed within a narrative of the government's views on the economy. The model should therefore also be capable of being integrated into the budget framework in a consistent manner. A model that is sufficiently specified to include economic determinants from the macroeconomic outlook will score well on this criterion.

Communication: transparency

When a model's assumptions are clear and transparent, stakeholders are reassured that forecasts are credible, and external scrutiny is assisted. This criterion rates whether the model lends itself to transparent reporting of assumptions and whether it can be scrutinised readily by someone with a general economics background (but not necessarily an advanced specialist degree). A simple model that limits the role of judgment will score well on this criterion; more complex models involving considerable judgment, or requiring advanced degrees to interpret, will score poorly.

Data compatibility

Most Scottish data on housing and other potential explanatory variables are available on a quarterly basis back to at least 2003 or the mid-1990s, depending on the level of aggregation. This criterion assesses the model's compatibility with 50 to 150 quarterly observations delivered in a timely fashion. [3]

Resources

The resources, or cost, of the model are related to its complexity and the specialised background it requires. This criterion rates the number and type of analysts required to run and maintain the model, the learning curve, and the ability for the work to be split and allocated within a team. A good score implies the model requires few team resources in proportion to the housing market's overall importance to the budget. A fair score implies the model could be implemented with resources proportionate to the role of the housing market in the budget. A poor score implies the model would require significant investment, greatly exceeding the relative importance of the housing market in the overall budget.


Email: Jamie Hamilton

Phone: 0300 244 4000 – Central Enquiry Unit

The Scottish Government
St Andrew's House
Regent Road
