This item is archived. Information presented here may be out of date.

Changing VATitudes towards admin data methods

Natasha Bance

With a master’s degree in economics and experience in the telematics industry, I was excited to put my knowledge to work on economic methods research when the Methodological Research Hub was established earlier in the year.

The Office for National Statistics (ONS) is committed to addressing the strategic recommendations made in the Independent review of UK Economic Statistics by Professor Sir Charles Bean. To help meet these recommendations, a team was formed within the hub to select methods for research and development, with the aim of enhancing the use of administrative data in economic statistics across ONS.

Our research and initial consultation across the Economic Statistics Group (ESG) identified a recurring priority: making use of HMRC Value Added Tax (VAT) data, a large and widely applicable dataset that offers many opportunities as well as challenges.

As with any administrative source, VAT data are designed for operational purposes. However, the data can be useful for statistical work, as they capture turnover across time, alongside other variables similar to those captured in the Monthly Business Survey. These data have applications across several areas, including National Accounts and Gross Domestic Product (GDP) estimates. Because they cover far more businesses than a survey sample could, VAT data enable more granular analysis of industry groupings.

As well as increased coverage of businesses, using the VAT data has numerous other benefits:
• potential to reduce response burden on smaller businesses
• increased frequency of statistics
• more granular financial outputs
• exploration of new opportunities for statistical outputs
• sub-national coverage, such as regional GDP

ONS already makes great use of VAT data, with a pipeline in place to ingest regular data from HMRC and compile a time series of turnover for each business. However, whilst there are well-established methods for cleaning, preparing, and estimating from business survey data, these methods need to be adapted before they are suitable for use on VAT data.

Based on talks with data users across ESG, our economics research team are working with the VAT team to investigate methods for:
• data cleaning and automatic editing rules: identifying and adjusting systematic and reporting errors
• improving timeliness of returns: investigating the impact of delayed returns on accuracy, and forecasting turnover for businesses whose returns are delayed so that outputs can be produced earlier
• calendarisation: finding a suitable method to adjust quarterly returns to create accurate monthly estimates whilst representing seasonal fluctuations in outputs
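
To make the calendarisation problem concrete, here is a minimal sketch in Python. It is not the method used in the VAT pipeline: it simply splits one quarterly total across its three months, in proportion to the number of days in each month by default. The function name `calendarise_quarter` and the optional `weights` argument (standing in for seasonal factors) are hypothetical and introduced only for illustration.

```python
import calendar


def calendarise_quarter(total, year, first_month, weights=None):
    """Split one quarterly turnover total into three monthly values.

    Illustrative only. By default the split is proportional to the number
    of days in each month. `weights` is a hypothetical hook for seasonal
    factors: three non-negative numbers, one per month of the quarter.
    """
    if not 1 <= first_month <= 12:
        raise ValueError("first_month must be between 1 and 12")

    # Build the three (year, month) pairs, allowing for staggered quarters
    # that run across a year end (for example December to February).
    months = []
    y, m = year, first_month
    for _ in range(3):
        months.append((y, m))
        m += 1
        if m > 12:
            y, m = y + 1, 1

    if weights is None:
        weights = [calendar.monthrange(yr, mo)[1] for yr, mo in months]
    if len(weights) != 3 or sum(weights) <= 0:
        raise ValueError("weights must be three numbers with a positive sum")

    # Scale the weights so the monthly values add back to the quarterly total.
    scale = total / sum(weights)
    return {ym: w * scale for ym, w in zip(months, weights)}


monthly = calendarise_quarter(9000.0, 2021, 1)
for (yr, mo), value in monthly.items():
    print(yr, mo, round(value, 2))
```

A day-weighted split keeps the quarterly total intact but assumes turnover is flat within the quarter, which is exactly the assumption a seasonal method would need to relax.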

Recent economic shocks from the coronavirus pandemic have caused issues across many statistical outputs which, if handled incorrectly, could lead to misreporting. These shocks also highlighted economic conditions that are problematic for the existing VAT data methodology.
Analysis using the thousand-pound rule, which looks for response errors where a value has been given in thousands that should be in whole pounds, found that the rate of edits varies by industry and season, and increased during lockdown periods. These patterns suggest that whilst the rule is generally achieving the desired effect, the editing rules could be improved for more volatile industries or for specific economic conditions.
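
As a rough illustration of how a rule of this kind can work, the Python sketch below compares a reported value with the median of the same business's other returns and treats a ratio close to 1,000 as a sign that the value was reported in thousands. The function name `thousand_pound_edit`, the use of a median as the reference, and the 0.65 to 1.35 tolerance band are all assumptions made for this example, not the rule as implemented in the VAT pipeline.

```python
from statistics import median


def thousand_pound_edit(value, history, lower=0.65, upper=1.35):
    """Return (edited_value, was_edited) for one reported turnover.

    Illustrative only. `history` holds the business's other returns in
    whole pounds. The reference (median) and the tolerance band
    (`lower`, `upper`) are assumed values, not pipeline parameters.
    """
    positives = [h for h in history if h > 0]
    if value <= 0 or not positives:
        # Nothing to compare against, so leave the value untouched.
        return value, False

    reference = median(positives)
    # The ratio is about 1000 when the value was reported in thousands.
    ratio = reference / value
    if 1000 * lower <= ratio <= 1000 * upper:
        return value * 1000, True
    return value, False


print(thousand_pound_edit(250, [240000, 260000, 255000]))     # flagged and scaled
print(thousand_pound_edit(250000, [240000, 260000, 255000]))  # left as reported
```

A sketch like this also shows why edit rates might move with economic conditions: when turnover genuinely collapses or is highly volatile, the comparison with a business's history becomes a less reliable guide to what was intended.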

Early analysis using the quarterly pattern rule, which searches a business’s turnover history for patterns that suggest misreported quarterly values, has shown that reporting issues are more common than you might expect. A trusted methodology is needed to make adjustments so that abnormal patterns do not affect industry-level estimates or misrepresent specific periods.
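
The Python sketch below gives a flavour of what a pattern search could look like. It assumes two illustrative patterns over four consecutive quarterly returns: the same value repeated in every quarter, and three zero returns alongside a single non-zero one. Both the patterns and the even-spread adjustment are assumptions chosen for this example (as are the names `quarterly_pattern` and `spread_single_return`); the rule used in practice may look for different patterns and adjust them differently.

```python
def quarterly_pattern(returns):
    """Classify four consecutive quarterly turnovers.

    Illustrative only. Returns "repeated" when all four values are the
    same positive number, "single_return" when exactly one value is
    non-zero and positive, and "none" otherwise.
    """
    if len(returns) != 4:
        raise ValueError("expected exactly four quarterly returns")

    if returns[0] > 0 and all(r == returns[0] for r in returns):
        return "repeated"

    non_zero = [r for r in returns if r != 0]
    if len(non_zero) == 1 and non_zero[0] > 0:
        return "single_return"

    return "none"


def spread_single_return(returns):
    """Spread a single annual-looking return evenly over four quarters.

    An assumed, deliberately simple adjustment: any other pattern is
    returned unchanged.
    """
    if quarterly_pattern(returns) != "single_return":
        return list(returns)
    quarter_value = sum(returns) / 4
    return [quarter_value] * 4


print(quarterly_pattern([5000, 5000, 5000, 5000]))
print(quarterly_pattern([0, 0, 0, 20000]))
print(spread_single_return([0, 0, 0, 20000]))
```

An even spread removes the spike but also flattens any real seasonality, which is one reason the choice of adjustment needs a trusted methodology rather than a quick fix.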
The effect of changing the order in which cleaning rules are applied is also of interest, and further work is ongoing to examine the overall impact of cleaning methods on the quality and noise effects of ‘true’ and ‘false’ edits.

The team is also exploring the use of outlier detection methods in place of editing rules, but initial testing found these methods unsuitable because of the skew of business turnovers in VAT data.
We are also exploring seasonal autoregressive integrated moving average (SARIMA) models as a forecasting method, to shorten the delay between the period of interest and the release of data.
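
To show the idea behind this kind of forecast, the Python sketch below fits one very small member of the seasonal ARIMA family: an ARIMA(1,0,0)(0,1,0) model with no constant, estimated by conditional least squares. The series is seasonally differenced, a single autoregressive coefficient is fitted to the differences, and forecasts are built by adding the predicted difference to the value one season earlier. The model order and the function names `fit_phi` and `forecast` are assumptions made for this example; real work would more likely use an established library, such as the `SARIMAX` class in statsmodels, with orders chosen from the data.

```python
def fit_phi(series, season=12):
    """Estimate the AR(1) coefficient of the seasonally differenced series.

    Uses the conditional least squares estimate for the model
    d[t] = phi * d[t-1] + error, where d[t] = y[t] - y[t-season].
    """
    diffs = [series[t] - series[t - season] for t in range(season, len(series))]
    numerator = sum(diffs[t] * diffs[t - 1] for t in range(1, len(diffs)))
    denominator = sum(diffs[t - 1] ** 2 for t in range(1, len(diffs)))
    # If every lagged difference is zero, fall back to phi = 0.
    return numerator / denominator if denominator else 0.0


def forecast(series, steps, season=12):
    """Forecast `steps` periods ahead with ARIMA(1,0,0)(0,1,0), no constant."""
    if len(series) < season + 2:
        raise ValueError("need at least season + 2 observations")

    phi = fit_phi(series, season)
    history = list(series)
    last_diff = history[-1] - history[-1 - season]
    for _ in range(steps):
        # Predicted seasonal difference, then undo the differencing.
        last_diff = phi * last_diff
        history.append(history[-season] + last_diff)
    return history[len(series):]


# A made-up quarterly turnover series with a repeating seasonal shape.
turnover = [100, 120, 90, 150, 104, 126, 93, 158, 109, 131, 97, 165]
print([round(x, 1) for x in forecast(turnover, 4, season=4)])
```

Used for delayed returns, a model like this would be fitted to the periods a business has already reported and then asked for the one or two periods still outstanding.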

With initial analysis highlighting potential areas for further research and method development, over the coming months my colleagues and I will be working with the VAT team to assist and provide insight for upcoming revisions to the VAT pipeline.

If you would like to know more about the VAT research, or the wider research of the Methodological Research Hub, or if you are interested in collaborating with us, please contact Tom Tarling or Natasha Bance.

Tom Tarling is a methodologist working within the Methodological Research Hub. His background is in economics and statistics, with experience in the telematics industry. His current work includes editing and imputation methods and administrative data methods research.