
Easy ways to prepare your BigQuery warehouse for AI

Katie Kaczmarek, 23 April 2025, 4 min read

You’ve probably heard that AI is coming to make our lives easier, especially in tools like BigQuery. But here’s the thing: AI isn't magic. If you want it to be accurate and useful, you need to set it up for success.

One of the best ways to do that? Improve the metadata in your BigQuery warehouse.

Metadata is like the index or contents page in a book: it tells you what’s inside and where to find it. Clear metadata helps AI understand your data warehouse and give you more accurate, relevant results.

Here’s how you can easily start improving your warehouse metadata today:

  1. Clearly describe your tables and fields (what they contain and why they're useful).
  2. Link related tables together (like connecting dots) so it's obvious how they're related.
  3. Tag your tables with keywords to help you and AI find them quickly.
  4. Keep everything neat and consistent, using the same approach everywhere.

1. Add descriptions to your tables and fields

Currently you either have to do this manually (yes, typing away) or use an open-source tool, like the one we developed here at Measurelab. There's good news, however: BigQuery's new Data Insights feature should be able to generate table and column descriptions automatically, which simplifies the documentation process.

Why bother?

Imagine AI trying to figure out what "cust_id" or "rev_tot" stands for. Field and table descriptions are like name tags at a party; the clearer they are, the smoother the introductions go. Good descriptions mean AI spends less time guessing and more time providing accurate insights.
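If you are adding descriptions by hand, one option is to script the DDL rather than click through the console. The sketch below is a minimal illustration, not the Measurelab tool or Data Insights: it builds `ALTER TABLE ... SET OPTIONS (description = ...)` statements as strings. The table name `sales.orders` and the meanings given to `cust_id` and `rev_tot` are assumptions made up for the example.

```python
def sql_string(text: str) -> str:
    """Quote text as a double-quoted SQL string literal.

    Escapes backslashes, double quotes and newlines so the description
    cannot break out of the literal.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def table_description_ddl(table: str, description: str) -> str:
    """Build the statement that sets a table-level description."""
    return f"ALTER TABLE `{table}` SET OPTIONS (description = {sql_string(description)});"


def column_description_ddl(table: str, column: str, description: str) -> str:
    """Build the statement that sets a column-level description."""
    return (
        f"ALTER TABLE `{table}` ALTER COLUMN `{column}` "
        f"SET OPTIONS (description = {sql_string(description)});"
    )


# Hypothetical table, columns and wording, for illustration only.
statements = [
    table_description_ddl("sales.orders", "One row per customer order."),
    column_description_ddl("sales.orders", "cust_id", "Unique customer identifier."),
    column_description_ddl("sales.orders", "rev_tot", "Total order revenue, e.g. 129.99."),
]

for statement in statements:
    print(statement)
```

The printed statements can be reviewed and then run in the BigQuery console or through whichever client you already use.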

Quick wins (A term I hate in SQL requests):

  • Add simple, clear field and table descriptions
  • Include example values where it helps
  • Use Data Insights to scale it across your warehouse

2. Link related tables clearly

Think of your datasets like a family tree: everything is connected. Clearly defining relationships between your tables makes it easier for AI (and your team!) to understand how everything fits together.

Simple steps:

  • Clearly document relationships in your table descriptions.
  • Identify joining keys in your field descriptions.
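One way to keep relationship notes consistent is to generate them from a single list of join keys instead of writing each one freehand. The sketch below assumes a hypothetical `JoinKey` structure and a "Joins to" / "Related tables" wording. Neither is a BigQuery convention; they are just one possible house style.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinKey:
    column: str         # column in this table
    target_table: str   # table it joins to
    target_column: str  # matching column in the target table


def describe_join_key(base_description: str, key: JoinKey) -> str:
    """Append a consistent 'Joins to' note to a field description."""
    base = base_description.rstrip(". ")
    return f"{base}. Joins to {key.target_table}.{key.target_column}."


def describe_table_relationships(base_description: str, keys: list[JoinKey]) -> str:
    """Summarise every outgoing join in the table description."""
    if not keys:
        return base_description
    base = base_description.rstrip(". ")
    links = "; ".join(f"{k.column} -> {k.target_table}.{k.target_column}" for k in keys)
    return f"{base}. Related tables: {links}."


# Hypothetical tables and columns, for illustration only.
customer_key = JoinKey("cust_id", "sales.customers", "cust_id")
field_text = describe_join_key("Unique customer identifier.", customer_key)
table_text = describe_table_relationships("One row per customer order", [customer_key])

print(field_text)
print(table_text)
```

The resulting strings can then be written to the table and field descriptions by whatever method you already use for descriptions.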

3. Tag your tables for quick discovery

Tags are your shortcut for finding data fast. Think of them like labels on folders or bookmarks in your browser. They’re great for quick filtering and navigation, helping you (and AI) pinpoint exactly what you need without wasting time.

My advice?

  • Think practically. Tags should reflect actual usage, like "finance", "marketing", or "PII".
  • Be strategic and consistent in tagging, making it easy to sort and access your tables quickly.
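As a rough sketch of consistent tagging, the snippet below assumes you store tags as BigQuery table labels, which is one mechanism among several and not necessarily the one meant above. Labels are restricted to lowercase characters, so a tag like "PII" is normalised to "pii" first. The label keys `domain` and `sensitivity` and the table name are hypothetical, and the clean-up rule is deliberately conservative.

```python
import re

LABEL_MAX_LENGTH = 63  # BigQuery limits label keys and values to 63 characters


def to_label_text(tag: str) -> str:
    """Lowercase a tag and replace anything outside a-z, 0-9, '_' and '-'."""
    cleaned = re.sub(r"[^a-z0-9_-]", "_", tag.strip().lower())
    return cleaned[:LABEL_MAX_LENGTH]


def labels_ddl(table: str, labels: dict[str, str]) -> str:
    """Build an ALTER TABLE statement that sets the given labels."""
    pairs = ", ".join(
        f'("{to_label_text(key)}", "{to_label_text(value)}")'
        for key, value in sorted(labels.items())
    )
    return f"ALTER TABLE `{table}` SET OPTIONS (labels = [{pairs}]);"


# Hypothetical table and label scheme, for illustration only.
ddl = labels_ddl("sales.orders", {"domain": "finance", "sensitivity": "PII"})
print(ddl)
```

Because the statement sets the `labels` option as a whole, check which labels a table already has before running it, so existing ones are not lost.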

4. Use clear, consistent naming conventions

Good naming conventions aren’t just nice; they're essential. Clear names prevent future confusion, save time, and make your warehouse easier to navigate for everyone, especially AI.

Imagine naming your files "final_final_3.csv" in your Google Drive. Finding anything later would be a nightmare! Clear, consistent names like "sales_data_jan2025" help AI understand exactly what’s in each table.
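A convention is easier to keep when something checks it. The sketch below assumes a hypothetical rule (lowercase snake_case, starting with a letter) and flags table names that break it. Swap in whatever pattern your team actually agrees on.

```python
import re

# Hypothetical convention: lowercase words or numbers joined by single underscores.
TABLE_NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def find_nonconforming(names: list[str]) -> list[str]:
    """Return the names that do not match the agreed pattern, in input order."""
    return [name for name in names if not TABLE_NAME_PATTERN.fullmatch(name)]


# Example names, for illustration only.
candidates = ["sales_data_jan2025", "Final Final 3", "salesData", "core_metrics_summary"]
problems = find_nonconforming(candidates)
print(problems)
```

A pattern check only catches format. A name like "final_final_3" would pass it, so descriptive naming still needs a human review.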

Bonus Tip: Create report-specific tables

If you want to go one step further, consider building core reporting tables designed to make analysis easier.

Think of these as simplified, centralised “AI-friendly” versions of your data. They can also act as your single source of truth for dashboards and recurring reports.

Examples of useful AI-ready tables:

  • core_metrics_summary: daily or weekly KPI snapshots
  • user_engagement_core: simplified GA4-style user data
  • product_performance: clean sales data by product
  • customer_lifetime_value: key user value metrics
  • data_dictionary_ai: a table AI can refer to for definitions and aliases

These tables make it easier for AI tools to produce reliable outputs and for people to report from the same base data.
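As one possible starting point for a table like data_dictionary_ai, the sketch below builds a query that copies column names, types and descriptions out of BigQuery's `INFORMATION_SCHEMA.COLUMN_FIELD_PATHS` view. This is an assumption about how such a table could be seeded, not a prescribed method. The dataset name `analytics` is hypothetical, and aliases would still need to be added by hand.

```python
def data_dictionary_sql(dataset: str, target_table: str = "data_dictionary_ai") -> str:
    """Build a CREATE TABLE AS SELECT that snapshots column metadata for a dataset."""
    return (
        f"CREATE OR REPLACE TABLE `{dataset}.{target_table}` AS\n"
        "SELECT table_name, column_name, field_path, data_type, description\n"
        f"FROM `{dataset}`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS\n"
        # Leave the dictionary table itself out of the dictionary.
        f"WHERE table_name != '{target_table}';"
    )


# Hypothetical dataset name, for illustration only.
dictionary_sql = data_dictionary_sql("analytics")
print(dictionary_sql)
```

The table is a snapshot, so it would need to be rebuilt (for example with a scheduled query) whenever descriptions change.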

Quick recap (because we love simplicity):

  • Add clear field and table descriptions using BigQuery's Data Insights.
  • Clearly link related tables together.
  • Use strategic tags for easy navigation.
  • Implement consistent and understandable naming conventions.
  • Consider creating core reporting tables to support AI and your team.

Small efforts now will make AI a more helpful, accurate assistant for everyone in your team, coders and non-coders alike.

What are you doing today to get ready for AI in BigQuery?

