#131 From raw export to insight-ready data (with Johan van de Werken at GA4Dataform)
Johan van de Werken joins The Measure Pod to discuss GA4Dataform, BigQuery, and the future of analytics engineering and automation.
All about Dataform (23 posts)
Johan van de Werken joins The Measure Pod to discuss GA4Dataform, BigQuery, and the future of analytics engineering and automation.
In this 30-minute webinar, learn how to bring structure and sanity to your SQL workflows, without adding complexity. We cover practical
In this episode of The Measure Pod, Dara and Matthew sit down with Gunnar Griese from 8-bit-sheep to discuss MCPs and their growing role in digital analytics.
The Springer Nature Group is an academic publishing company, with brands dating back to 1842, that advances scientific discovery by publishing robust and insightful research, supporting the development of new areas of knowledge, making ideas and information accessible around the world, and leading the way on open access. The challenge Springer Nature needed to migrate their Universal Analytics dashboards to GA4 data. Their reporting relied on multiple stacked scheduled queries in BigQuery tha
In this episode of The Measure Pod, Dara and Matt are joined by Verónica Delgado-Benito to chat through her journey from molecular biology PhD to data analyst at Springer Nature.
In our recent engagement with a client, we went on a journey to transform their data pipelines, tackling inefficiencies in performance and cost within their Google Cloud BigQuery environment. Our efforts culminated in a comprehensive optimisation strategy that used Dataform, improved SQL practices, and implemented tailored solutions for significant performance gains and cost savings. Here’s a deep dive into the highlights of our project. Identifying inefficiencies in BigQuery workflows We beg
Dataform is a powerful tool for managing your data workflows in a structured, version-controlled, and automated way. Whether you're a beginner or an experienced data engineer, Dataform simplifies SQL-based transformations while integrating seamlessly with Google BigQuery. Although this blog offers a basic introduction to Dataform's functionality, users can achieve significantly more with Dataform. From advanced scheduling, parameterised queries, and dependency management to complex data modelli
In this episode of Behind the Cloud, Matthew dives into the details of releases and scheduling in Dataform. He breaks down how to manage different versions of your codebase in GitHub, from taking snapshots to scheduling executions at daily, hourly, or monthly intervals. By the end of the episode, you’ll have the know-how to confidently release and schedule your code, making it easier to build robust tables and models with Dataform. Video transcript Introduction to releases and sche
Efficient data loading is crucial for managing and updating tables in Dataform. Various strategies exist to handle different use cases, including truncate and load, appending data, and leveraging incremental tables with unique keys. This blog explores these primary methods and more: Truncate and Load In this method, all existing records in the target table are deleted and replaced with a fresh table. This approach works well when a full table refresh is necessary or if managing slowly changin
Managing scheduled queries in BigQuery often feels limiting: there’s no version control, no easy collaboration, and scaling can be difficult. If you’ve ever wondered how to make SQL workflows smoother, Dataform is your answer. In this post, I’ll show you how I migrated a BigQuery scheduled query to Dataform and how it transformed the way I manage my data pipelines. After all, we all want to know who’s been touching our queries, don’t we? Getting started in Dataform First thing you need to
Setting up a Dataform repository can be challenging without the right steps. Whether you’re new to Dataform or want to optimise your workflow, this guide will show you how to seamlessly connect it with GitHub and Google Cloud (GC). What is Dataform and why use it? Dataform is a powerful tool for managing version-controlled SQL workflows in a collaborative way. GC incorporates BigQuery and GitHub integration, providing an efficient way to organise and maintain complex data pipelines. Let’s bre
Discover 'The Complexity Paradox': why seemingly simple tools like spreadsheets can lead to hidden complexity, while learning powerful tools like BigQuery and SQL can bring true simplicity and scalability to your data workflows.
Learn how Dataform can solve the challenges of outdated SQL, improving data reliability, scalability, and efficiency for marketing teams.
The Springer Nature Group is an academic publishing company, with brands dating back to 1842, that advances scientific discovery by publishing robust and insightful research, supporting the development of new areas of knowledge, making ideas and information accessible around the world, and leading the way on open access. The challenge The sales and marketing teams depended on incomplete data, which didn’t capture the entire customer journey due to different systems in use. Transactions and re
In this week's episode of The Measure Pod, Dan and Bhav are joined by Ken Williams from DiveTeam to discuss modern marketing and measurement strategy.
In this episode of Behind the Cloud, Matthew demonstrates how to enable and set up a Dataform project within BigQuery, connect it to GitHub, and initialise the workspace for building a Dataform project. Matt walks us through enabling BigQuery, creating a repository, setting up the region, and using service accounts. Video transcript Introduction to Dataform in BigQuery [00:00:00] Matt: Hello and welcome to this week’s Behind the Cloud. Sticking with the practical theme today, we’re going to
In this episode of Behind the Cloud, Matt discusses Dataform, what it is, and why it matters. Video transcript Cloud Data Warehousing [00:00:00] Matt: Welcome to Behind the Cloud. Today we’re exploring Dataform, but first a little bit of scene setting. Over the past number of years, cloud computing, specifically cloud data warehousing, has advanced significantly. Huge amounts of data can be queried in seconds. The scalability of the platforms is near infinite from both a performance and a
In this week's episode of The Measure Pod we spoke with David Jayatillake, co-founder of Delphi and organiser of the London Analytics Meetup.
In this episode of Behind the Cloud, Matt discusses the essentials of Google Cloud’s BigQuery: everything from project structure and data handling to understanding the costs involved. Video transcript [00:00:00] Matt: Hello and welcome to Behind the Cloud. Today we’re going to be diving into the nuts and bolts of Google Cloud’s BigQuery and how it can help to revolutionise your marketing analytics. Whether you’re really familiar with the cloud or this is all new to you, this episode aims to gui
In this episode of Behind the Cloud, Matthew aims to answer the question: what are the Google Cloud Platform tools of the marketing analytics trade? And more specifically, what are the tools that you should care about in Google Cloud? For a more in-depth write-up on Google Cloud tools, check out Matt’s blog post. Video transcript [00:00:00] Matt: Hello and welcome to today’s episode of Behind the Cloud. We’re going to try and answer the question, what are the GCP tools of the marketing analyt
GCP is vast and can be overwhelming. We aim to sort through the clutter and highlight what you need to know as a digital marketing professional.
This week Dan and Dara are joined by Johan van de Werken, creator of GA4BigQuery.com. They chat about how GA4BigQuery started and where it's going, how Johan moved from letters to numbers.
There’s no one-size-fits-all approach to building an analytics team, but there are four types of people you should look for to advance your in-house analytics capabilities.