To produce the analysis results, we interrogated the full set of raw data hosted on Embed. During the analysis we found that we first had to write code to convert the raw data into a usable form. From this page you can download the full usable dataset, along with the code we used to extract and analyse the data from the Embed database.
Download the Retrofit dataset
The usable dataset includes all the energy (gas and electricity) meters used to generate the charts. The data is at five-minute intervals, which was the interval specified for the meters in each retrofit property.

Purpose of the dataset

We expect that this dataset will be a key resource for conducting your own research.
Here are a few examples of the sort of questions we think the dataset could be used to research:
  • How significant are the occupants to the annual energy consumption of these homes?
  • Are there correlations in the retrofit performance outcomes in properties with different packages of measures?
  • Can occupancy routines be accurately inferred from energy consumption data recorded at a five-minute resolution?
  • What could be the implications for the UK electricity grid if retrofitted homes made a mass transition from gas heating to electric forms of heating?
  • Is there any evidence for unintended negative consequences of deep retrofit of homes?
File structure

The main download is a single folder, containing a subfolder per property and a file per energy meter. The files contain the Technology Strategy Board property ID, property UUID, device ID, timestamp, value, units, sensor type and period type for each energy meter.
The separate metadata table shows the property, device ID, device type, and measurement units for each energy meter.
There is also a property area file, which gives the floor area of each property in square metres so that emissions per area can be calculated. This file is in CSV format and identifies each property by its ID, with the Technology Strategy Board prefix and leading zeros stripped off, alongside its area in square metres.
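As a rough illustration of the per-area calculation, the sketch below sums a property's readings, applies an emissions factor, and divides by the floor area. It assumes the readings have already been converted to kWh. The function name, its parameters, and the emissions factor are hypothetical: the dataset does not supply a factor, so you would need to choose one appropriate to the fuel.

```python
def emissions_per_square_metre(readings_kwh, area_m2, factor_kg_per_kwh):
    """Return emissions per square metre for one property.

    readings_kwh: iterable of energy readings, assumed to be in kWh
    area_m2: floor area taken from the property area file
    factor_kg_per_kwh: emissions factor chosen by the user (not in the dataset)
    """
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    total_kwh = sum(readings_kwh)
    return total_kwh * factor_kg_per_kwh / area_m2


# Illustrative numbers only; none of these values come from the dataset.
example = emissions_per_square_metre(
    [0.5, 0.25, 0.25], area_m2=80.0, factor_kg_per_kwh=0.2
)
```

Gas and electricity would normally need different factors, so it is probably simplest to call the function once per meter type and add the results.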
The combined files are in a compressed .tar.gz archive. The compressed file is roughly 53 MB; uncompressed it is 1.6 GB.
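If you prefer to unpack the archive from a script, the following is a minimal sketch using Python's standard tarfile module. The function name and both path arguments are placeholders for illustration. It assumes the archive unpacks into one subfolder per property, as described above, and it groups the extracted files by the name of their parent folder.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Unpack a .tar.gz archive and group the extracted files by parent folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # "r:gz" opens a gzip-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest)

    # Map each folder name (assumed to be a property) to the files inside it.
    files_by_folder = {}
    for path in sorted(dest.rglob("*")):
        if path.is_file():
            files_by_folder.setdefault(path.parent.name, []).append(path)
    return files_by_folder
```

Allow for about 1.6 GB of free disk space in the destination folder before extracting.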

Advice on format

All files are in tab-separated values (TSV) format, which is a standard machine-readable form.
If you have trouble reading them in Excel, please use the "Text to Columns" command to specify that the columns are separated by tabs.
If reading them into R, use read.csv and set sep='\t'.
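If reading them into Python, the standard csv module can handle tab-separated files. The sketch below is one possible approach rather than the code used for the analysis. The function name is hypothetical, and because it is not stated here whether the files carry a header row, the function lets you either take column names from the first row or supply them yourself.

```python
import csv


def read_tsv(path, fieldnames=None):
    """Read a tab-separated file into a list of dictionaries.

    If fieldnames is None, the first row is used as the header.
    Otherwise every row is treated as data and labelled with the given names.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, fieldnames=fieldnames, delimiter="\t")
        return [dict(row) for row in reader]
```

All values come back as strings, so convert the reading to a number (for example with float) before doing any arithmetic on it.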


The Embed database is made available under the Open Database License.
Any rights in individual contents of the database are licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported license.

