The derelict data warehouse, revisited: why this problem just became existential
AI doesn't fix bad data; it scales it. In 2026, a derelict data warehouse isn't just a nuisance; it’s an existential risk.

Right, now that some of the basic concepts and industry jargon are starting to sound a bit more familiar, things are getting interesting. Not that last week was boring, but this week I’ve been having a go at one of the most important parts of the “initial discovery” for a client: a Google Analytics Health Check.
In this document, which the team creates for every single one of our clients, we run a detailed audit of the client’s data collection and report configurations - and make all the necessary recommendations to ensure that:
All the above might sound pretty obvious but there are so many little details (e.g. snippets of JavaScript code, settings here and there) that one needs to get right for the whole thing to work.
So, to cut to the chase, what do we normally include in a Health Check?!
(drumroll please)
The contents of each Health Check will normally depend on a client’s measurement requirements and specific business needs…BUT these are the bits that are normally covered:
Here, we try to answer questions regarding the general implementation of the account. For example:
The aim in this section is to clarify whether the data is being collected accurately and appropriately so it can be processed and analysed properly later on. I can’t tell you how many times this week I’ve asked the team about JavaScript snippets, cookies – the internet ones – and so on.
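As a rough illustration of the kind of check involved, here is a minimal Python sketch that scans a page's HTML for Google Analytics tracking IDs. The function name `audit_tracking_ids` and the ID patterns are assumptions made for this example, not part of the team's actual Health Check, and a real audit would look at far more than the presence of an ID.

```python
import re

# Assumed ID shapes, for illustration only:
#   Universal Analytics IDs such as UA-12345-1
#   GA4 measurement IDs such as G-ABC123XYZ
TRACKING_ID = re.compile(r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12})\b")


def audit_tracking_ids(html: str) -> dict:
    """Report which tracking IDs appear in a page's HTML.

    Hypothetical helper: it only pattern-matches the markup and does
    not confirm that the snippet actually fires in a browser.
    """
    ids = sorted(set(TRACKING_ID.findall(html)))
    return {
        "ids": ids,              # distinct tracking IDs found on the page
        "missing": not ids,      # True when no tracking ID was found at all
    }


if __name__ == "__main__":
    page = "<script>ga('create', 'UA-12345-1', 'auto');</script>"
    print(audit_tracking_ids(page))
```

A page that reports more than one distinct ID, or none at all, is a prompt to look at the implementation more closely rather than proof that something is broken.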
Here, we list the Web Properties that have been created or that we have been granted access to, and review some of the most important aspects of each Web Property - for example:
So, basically, it’s about fine-tuning the data gathering! The more data we have, the better, as we can always worry about filtering during the next steps.
Great, so the data is coming in nicely. Now we just need to tidy it up and make sure that we are only getting what we need.
By now we should have a clear idea of the state of the account and whether there are any issues that need to be addressed urgently. Ensuring that the data is being gathered appropriately is probably the top priority during the first steps of the health check, as this will guarantee that once the Measurement Plan starts to take shape, we have as much information as possible to meet our needs.
And that’s it for now folks! Next week I’ll wrap it up by covering the final steps of the Health Check. In the meantime, if you have any questions, comments – or advice – please do not hesitate to get in touch or leave a comment below. :)