Behind the Cloud: What GCP tools should you be familiar with?

In this episode of Behind the Cloud, Matt aims to answer the question: what are the Google Cloud Platform (GCP) tools of the marketing analytics trade? And more specifically, what are the tools that you should care about in Google Cloud Platform? For a more in-depth write-up on GCP tools, check out Matt’s blog post.

Watch below 👇 or head over to our YouTube channel

Transcript

[00:00:00] Matt: Hello and welcome to today’s episode of Behind the Cloud. We’re going to try and answer the question, what are the GCP tools of the marketing analytics trade? Or to put it in plain English, what are the tools that you should care about in Google Cloud Platform as a marketing analytics professional?

[00:00:20] Matt: Google Cloud Platform is a vast universe of over 100 services catering to various industries. It’s incredibly powerful, but it can also be very overwhelming, especially for those of you new to the cloud computing world. So let’s delve into some of the key GCP tools that you should know about.

BigQuery 

[00:00:37] Matt: BigQuery: This is GCP’s flagship serverless data warehouse, perfect for storing, joining and analysing large data sets. It’s sure to be where you’re going to spend the majority of your time. You’ll need to have some SQL knowledge, but many generative AI features are coming online to lower the barrier to entry in that regard.

[00:00:54] Matt: Overall, BigQuery simplifies data management and scales with your needs. You’ve just got to make sure that you follow best practices and watch out for costs associated with data processing and managing large data sets. 
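
The point about processing costs can be made concrete with a back-of-the-envelope estimate. The sketch below is plain Python and does not call BigQuery. The price per TiB is an illustrative assumption (check the current on-demand rate for your region), and `estimate_query_cost` is a hypothetical helper name. In practice, the bytes figure would come from a dry run of the query.

```python
# Rough on-demand cost estimate for a query, based on bytes scanned.
# ASSUMPTION: the price below is illustrative only; check current pricing.
ASSUMED_USD_PER_TIB = 6.25
BYTES_PER_TIB = 1024 ** 4


def estimate_query_cost(bytes_processed: int,
                        usd_per_tib: float = ASSUMED_USD_PER_TIB) -> float:
    """Return the estimated cost in USD for scanning `bytes_processed` bytes."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / BYTES_PER_TIB * usd_per_tib


if __name__ == "__main__":
    # Hypothetical query that scans 250 GiB of data.
    scanned = 250 * 1024 ** 3
    print(f"Estimated cost: ${estimate_query_cost(scanned):.2f}")
```

Selecting only the columns you need and filtering on partitioned columns both reduce the bytes scanned, which is what this estimate is driven by.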

Dataform

[00:01:06] Matt: Dataform: This tool enables scalable and reliable SQL transformations, integrating software engineering practices like version control and testing into data workflows, all now built into the BigQuery UI. You’ll need familiarity with systems like GitHub and coding languages like JavaScript and SQL, but once you start to scale up a little bit, it’s a good tool and a set of principles and practices to adopt.

Cloud Functions 

[00:01:29] Matt: Cloud Functions: Classed as function as a service (FaaS), this is a serverless solution that allows you to run backend code in response to events without having to manage servers. It’s great for data processing, pipelining and automation, but you should know its limitations, like execution time and memory constraints. You’ll also need to know how to code, for example in JavaScript or Python.

Cloud Scheduler

[00:01:53] Matt: Cloud Scheduler: Think of this as a cloud-based task scheduler. It’s user-friendly, especially if you’ve ever used cron jobs before, but it is very simple either way. You can automate tasks like triggering Cloud Functions or workflows to kick off a cascade of actions within the Google Cloud Platform.
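
For anyone who has not used cron before: Cloud Scheduler schedules are written in the unix-cron format, a string of five space-separated fields. The sketch below only labels those fields so the format is easier to read. It does not validate the values or talk to Cloud Scheduler, and `split_cron` is a hypothetical helper name.

```python
# The five fields of a unix-cron schedule string, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    # "0 6 * * 1" reads as: minute 0, hour 6, any day of month, any month, Monday.
    for name, value in split_cron("0 6 * * 1").items():
        print(f"{name:>12}: {value}")
```

A schedule like this one would suit a weekly job, for example triggering a function every Monday morning to refresh reporting data.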

Cloud Composer

[00:02:11] Matt: Cloud Composer: This is a managed orchestration service built on something called Apache Airflow. It’s great for scheduling and monitoring data processing workflows, but it comes with a steep learning curve and potentially higher costs. It’s definitely one for more mature warehouse configurations. 

Cloud Storage

[00:02:27] Matt: Cloud Storage: An intuitive storage solution, ideal for various data types, and a great temporary staging ground for further data processing. It also has loads of options for longer-term, archival-type storage to help control costs while keeping data long term.
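
One way the archival options are typically applied is through a bucket lifecycle configuration, which moves or deletes objects as they age. The sketch below builds such a configuration as a Python dictionary and prints it as JSON. The age thresholds are illustrative assumptions, `build_lifecycle_config` is a hypothetical helper name, and the exact schema should be checked against the Cloud Storage lifecycle documentation before use.

```python
import json


def build_lifecycle_config(coldline_after_days: int, delete_after_days: int) -> dict:
    """Build a lifecycle config: move objects to Coldline, then delete them."""
    if not 0 < coldline_after_days < delete_after_days:
        raise ValueError("expected 0 < coldline_after_days < delete_after_days")
    return {
        "rule": [
            {
                # Move objects to a cheaper storage class once they are old enough.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days},
            },
            {
                # Remove objects entirely after the retention period.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    # ASSUMPTION: 90 and 365 days are example thresholds, not recommendations.
    print(json.dumps(build_lifecycle_config(90, 365), indent=2))
```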

Pub/Sub

[00:02:44] Matt: Pub/Sub: A messaging service that connects applications for real-time information flow. It’s really simple and scalable, but managing message duplication and subscriptions can get a little bit tricky. Here we’re really starting to delve into the realms of real-time data processing, which may be beyond what many of you need, but it also has a role in triggering the various services within GCP via things like Cloud Functions.
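
Message duplication matters because a subscriber can receive the same message more than once, so handlers are usually written to be safe to repeat. The sketch below shows one common approach: track message IDs and skip repeats. It assumes a message shaped like a dictionary with `messageId` and base64-encoded `data` keys carrying a JSON payload. The in-memory set and the `handle_message` name are hypothetical stand-ins; a real deployment would need a shared, durable store because function instances do not share memory.

```python
import base64
import json
from typing import Optional

# ASSUMPTION: an in-memory set stands in for a durable, shared store.
_seen_message_ids = set()


def handle_message(message: dict) -> Optional[dict]:
    """Decode a message payload once; return None for a repeat delivery."""
    message_id = message["messageId"]
    if message_id in _seen_message_ids:
        return None  # duplicate delivery: skip the work
    # The payload arrives base64-encoded; here it is assumed to be JSON.
    payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    _seen_message_ids.add(message_id)
    return payload


if __name__ == "__main__":
    # Hypothetical message carrying a small JSON event.
    body = json.dumps({"event": "purchase", "value": 42}).encode("utf-8")
    msg = {"messageId": "demo-1", "data": base64.b64encode(body).decode("ascii")}
    print(handle_message(msg))  # first delivery is processed
    print(handle_message(msg))  # second delivery is skipped
```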

Cloud Workflows

[00:03:09] Matt: Cloud Workflows: This tool automates and orchestrates GCP tasks, combining services into one unified workflow. It requires an understanding of workflow orchestration and careful thought around error handling.

[00:03:22] Matt: One example of its use may be triggering a Cloud Function to extract and store data, and then triggering a Dataform execution to transform and build reporting tables. It does have limits around memory and processing, but this could be a gateway to more sophisticated tools like Cloud Composer.
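
The extract-then-transform example, together with the earlier point about error handling, can be sketched in plain Python. This is not workflow definition syntax and calls no GCP service: it only illustrates running two steps in order, with retries and exponential backoff on failure. `run_with_retry`, `run_pipeline` and the two step functions are hypothetical names, and the retry settings are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(step: Callable[[], T], max_attempts: int = 3,
                   base_delay_s: float = 1.0,
                   sleep: Callable[[float], object] = time.sleep) -> T:
    """Run `step`, retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


def run_pipeline(extract: Callable[[], T], transform: Callable[[T], object],
                 sleep: Callable[[float], object] = time.sleep) -> object:
    """Run the extract step, then pass its result to the transform step."""
    extracted = run_with_retry(extract, sleep=sleep)
    return run_with_retry(lambda: transform(extracted), sleep=sleep)


if __name__ == "__main__":
    # Hypothetical stand-ins for "extract and store" and "build reporting tables".
    def extract_and_store() -> str:
        return "raw_table"

    def build_reporting_tables(source: str) -> str:
        return f"reporting tables built from {source}"

    print(run_pipeline(extract_and_store, build_reporting_tables))
```

Passing `sleep` as a parameter keeps the sketch easy to test, since a test can substitute a function that records the delays instead of waiting.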

Dataflow

[00:03:39] Matt: Dataflow: Offers real-time and batch data processing capabilities. It’s ideal for processing streams of real-time data, but it requires a solid understanding of how it works, and coding skills, to use efficiently. It’s often used in conjunction with Pub/Sub and a database endpoint like BigQuery.

Dataprep

[00:03:57] Matt: Dataprep: Simplifies the data preparation phase with an intuitive visual interface. Very user-friendly, but it might be limiting for some specific complex transformations.

Dataplex

[00:04:08] Matt: Dataplex: A solution for managing data across data lakes and warehouses. It requires knowledge of data governance to use effectively, but it can be a really useful and powerful way of democratising data and discovery across an organisation.

Vertex AI

[00:04:21] Matt: Vertex AI: An end-to-end platform for machine learning, suitable for both novices and experienced practitioners. It simplifies the building, training and deploying of machine learning models, but it can be costly.

[00:04:34] Matt: You still need to do the work on feature engineering and so on before models are trained, and you’ve got to watch out for costs. Remember, this isn’t a complete list. There are other GCP services, but these are the key ones for marketing analytics, in my opinion. As you explore these tools, think about your end goal, whether it’s a dashboard, a chart, or a report, and choose the tools that offer the most efficient, scalable, and secure way to achieve that objective.

[00:05:01] Matt: Thanks for watching. If you have any questions or need further insights, feel free to reach out to us at Measurelab. We are a Google Cloud Service Partner and a Google Marketing Platform Partner, so we can answer questions across all of the Google stack. If you haven’t done it already, please subscribe to the YouTube channel to get updates when we release more of these Behind the Cloud videos in the future. Thanks very much.

Written by

Matthew is the Engineering Lead at Measurelab and loves solving complex problems with code, cloud technology and data. Outside of analytics, he enjoys playing computer games, woodworking and spending time with his young family.
