How to start forecasting in BigQuery with zero training
If you’d told me five years ago that I’d be forecasting product demand using a model trained on 100 billion time points… without writing a single line of ML code… I probably would’ve asked how many coffees you’d had that day ☕️
But its a brand new world. And it’s possible. Let me explain
What is TimesFM?
TimesFM is a new foundation model from Google, built specifically for time-series forecasting. Think of it like GPT for time, instead of predicting the next word in a sentence, it predicts the next block of future time-points based on patterns in the past.
It’s been trained on a massive dataset (we’re talking 100 billion real-world time-points from sources like Google Trends and Wikipedia views). So it’s picked up all sorts of general behaviours you’d expect in time-based data like trends, seasonality, and sudden spikes or drops.
But how can one model work across everything?
That was my first question too.
I assumed forecasting sales for a jumper would be wildly different from predicting hospital admissions or traffic volumes. And in many ways, it is. But the magic here is that most time-series, no matter the industry, follow common patterns:
- Peaks and troughs over weeks or months (seasonality)
- Gradual changes over time (trends)
- The fact that yesterday often affects today (autocorrelation)
TimesFM has learned to spot these behaviours from its pretraining. So when you give it new data, it already knows how to look for them.
That’s why it can do zero-shot forecasting, meaning it can predict for new datasets it’s never seen before, without any fine-tuning or custom training.
Wait, can I add business-specific info (like promotions or holidays)?
This was one of my follow-up questions, and I want to be super clear:
With AI.FORECAST()
in BigQuery, you can’t directly feed in extra business context like holiday flags, promotion periods, or price changes. The model only sees your chosen timestamp, value, and ID columns.
So while the model can pick up things like seasonal spikes (say, from previous Black Fridays), it won’t know what’s coming next unless your historical data already shows a pattern.
You can absolutely interpret forecasts with business context in mind, or manually adjust afterwards, but it’s not part of the model input right now.
How AI.FORECAST()
Decides Whether to Forecast by Day, Week or Month
This is something that caught me out at first, so worth knowing before you run the code.
When you use AI.FORECAST()
in BigQuery, you don’t explicitly set the forecast frequency. Instead, it looks at the gaps between your timestamps and makes an educated guess about what granularity to use.
Here’s what that usually looks like:
Spacing Between Dates | Forecast Granularity |
---|---|
1 day apart | Daily forecast |
7 days apart | Weekly forecast |
30 days apart | Monthly forecast |
1 year apart | Yearly forecast |
Sounds handy (and it is), but if your data is a bit patchy or has gaps (which GA4 often does), the model might misinterpret the granularity and give you a weird result (like 30 weeks instead of 30 days).
How to Take Control of the Granularity
If you want to be sure the model knows it’s working with daily data, you can fill in the gaps yourself before forecasting.
In the example below, we use GENERATE_DATE_ARRAY()
to build a full daily calendar, then join that to our real data, filling missing days with zeros. This gives us a nice, regular dataset that makes it clear we want a daily forecast.
Once your data is consistent, AI.FORECAST()
will honour that pattern and give you 30 days of output (rather than 30 weeks or 30 months).
Alternatively if we want weekly, or monthly data we might adjust the code to aggregate the data up to week, month or year. This will help AI.FORECAST()
to understand what you are trying to do.
Creating Your First Forecast in BigQuery
Here’s how to run a TimesFM-powered forecast using your GA4 data. In this example, we’re forecasting daily purchase quantities per product using the public GA4 ecommerce dataset.
We even filled in missing dates (because if your data doesn’t show up every day, BigQuery might decide it’s weekly or monthly instead which can throw off your forecast!).
Here’s the code:
-- STEP 1: Generate all calendar dates in the range
-- this will ensure you have data for every day even if its 0
WITH calendar AS (
SELECT day AS event_date
--to_do change the date range to the date range of data you are extracting in the next step
FROM UNNEST(GENERATE_DATE_ARRAY('2020-11-01', '2020-12-31')) AS day
),
-- STEP 2: Summarise daily quantity by item
-- to_do this is where you define the data you are interested in forecasting
qty_by_name AS (
SELECT
PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS event_date,
items.item_name,
SUM(items.quantity) AS daily_qty
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,
UNNEST(items) AS items
WHERE _TABLE_SUFFIX BETWEEN '20201101' AND '20201231'
AND event_name = 'purchase'
GROUP BY event_date, item_name
),
-- STEP 3: Join to calendar to fill in missing dates (assumes zero sales)
qty_filled AS (
SELECT
c.event_date,
i.item_name,
IFNULL(q.daily_qty, 0) AS daily_qty
FROM (SELECT DISTINCT item_name FROM qty_by_name) i
CROSS JOIN calendar c
LEFT JOIN qty_by_name q
ON q.item_name = i.item_name AND q.event_date = c.event_date
),
-- STEP 4: Forecast next 30 daily values using AI.FORECAST()
forecast_results AS (
SELECT
'forecast' AS data_type,
item_name,
CAST(forecast_timestamp AS DATE) AS event_date,
forecast_value AS value,
confidence_level,
prediction_interval_lower_bound,
prediction_interval_upper_bound,
ai_forecast_status
FROM AI.FORECAST(
(SELECT * FROM qty_filled),
data_col => 'daily_qty',
timestamp_col => 'event_date',
id_cols => ['item_name'],
#this might relate to 30 days, weeks, months, year depending on the data that is ingested
horizon => 30
)
),
-- STEP 5: Append historical actuals
actuals AS (
SELECT
'actual' AS data_type,
item_name,
event_date,
daily_qty AS value,
1 AS confidence_level,
CAST(NULL AS FLOAT64) AS prediction_interval_lower_bound,
CAST(NULL AS FLOAT64) AS prediction_interval_upper_bound,
CAST(NULL AS STRING) AS ai_forecast_status
FROM qty_filled
)
-- FINAL OUTPUT: forecast + actuals combined
SELECT * FROM forecast_results
UNION ALL
SELECT * FROM actuals
ORDER BY item_name, event_date;
Code language: PHP (php)
This gives you a tidy table of past data and forecasted values, ready to plot or join into a dashboard. You’re welcome!
Key Takeaways
- TimesFM can forecast new datasets without training, by learning general time-series behaviours.
- It works across industries because many time-based patterns (like seasonality and trends) are universal.
- BigQuery’s
AI.FORECAST()
uses TimesFM under the hood — no setup or model training required. - The model infers whether to forecast daily, weekly, or monthly based on your input timestamp spacing.
- You can’t add extra context directly into the model, but you can still interpret your results with that in mind.
Want the official documents? Here you go…https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-forecast
If you’re working with GA4 in BigQuery and want to get forecasting fast, give this a go. It’s oddly satisfying to see a model pick up on patterns you didn’t even spot yourself!
If you have any questions, feel free to reach out to Measurelab here.