#99 AI in Self-Serve Analytics (with Gaurav Tiwari @ Spotify)

Written by Will Hayes | February 23, 2024

The Measure Pod

In this week’s episode of The Measure Pod we spoke with Gaurav Tiwari, ex-Meta, currently working as an Engineering Manager at Spotify. We spoke about AI not just in data analysis but also its use when it comes to data preparation, semantic layering and analytics engineering. We also have a big announcement to make regarding one of our co-hosts, so stay tuned!

Show note links:

Find Gaurav on LinkedIn
Reach him via email gaurav@metalytics.uk
Spoke at London Analytics meetup on 29th Nov 2023 on “Empowerment through Data: The Evolution and Future of Self-Service Analytics”

🎥 The podcast is now available in vodcast (video) format! Watch the episode below, or over on YouTube.

Let us know what you think and fill out the Feedback Form, or email podcast@measurelab.co.uk to drop Dan, Dara and Bhav a message directly.

Follow Measurelab on LinkedIn and on Twitter/X, and join the CRAP Talks Slack community.

Find out when the next CRAP Talks event is happening on LinkedIn.

Music composed by Confidential – check out their lo-fi beats on Spotify.

Master Google Analytics 4 with Daniel Perry-Reed on the next GA4 Immersion 6-week cohort training course. Charity and group discounts available!

Quotes of the Episode:

“…The part that never dies is the human curiosity to know more about something.” – Gaurav
“…I would love to have a little bit more visibility around the social impact that people are making by saying that, yeah, we are contributing to making the world a better place.” – Gaurav

Transcript

*Please note the following transcript is AI generated.

Intro | Topic | Rapid fire

Intro

[00:00:00] Dara: So we just wrapped up our chat with today’s guest, Gaurav Tiwari. A really, really great conversation, wide ranging as it often is. It went a little bit philosophical as well, but we were specifically talking about things like the semantic layer. Really good chat, what did you guys think?

[00:00:32] Bhav: I loved this episode. I think it’s only because it touches on a topic that’s close to my heart, something we’ve discussed in the past, which is self serve. And I think what Gaurav did was help take that conversation to the next level, which is about using AI and generative AI and large language models to make self serve a reality and how we might go about that.

[00:00:55] Bhav: And as you mentioned, Dara, there was an element of philosophy that went into it, which I really like. I think some of these topics, it’s hard to have them at a purely intellectual and factual level. I think you need to start discussing some of the philosophical elements of these topics. And I don’t know, I really liked him and I think he’s someone I’d love to bring on, put in front of the CRAP Talks audience and give him a voice there because I think he’d be, he’d be fantastic.

[00:01:23] Dan: I really liked the way that we went as well, a slightly different direction, but equally valuable. Something I think not a lot of people talk about is where the hell you start, and what actual people and teams you need to be able to utilise generative AI, semantic and reporting layers, and talking about data democracy and self serve analytics as a whole. It’s so easy to go straight into the future vision of stuff without considering things like the philosophy that you mentioned, but also, how the hell do you even start this? Sometimes it assumes you have a team of data and data scientists of like 50 people to be able to do the kind of stuff that theoretically we talk about.

[00:02:02] Dan: So grounding it that way, I think, was really useful as well. It’s just such a fascinating, interesting episode. So be sure to stick through to the end when we hear the not so rapid fire questions. And thanks to Gaurav, I think we’re going to have to start renaming these because, I suppose, in hindsight they’re not really rapid fire when we’re asking these very big, grandiose questions.

[00:02:24] Dan: But a couple of plugs quickly. Remember to check the show notes for all the links that we talk about in today’s episode. You’ll also find links to join the Slack community for CRAP Talks, where myself, Bhav and Dara are chatting away. Get involved and ask us stuff over there, as well as loads of other people in all sorts of different fields around analytics.

[00:02:41] Dan: We’re on video. So if you’re not aware already, or haven’t checked us out, check out my amazing Star Trek Christmas jumper as we record this one, I managed to show flash on the end. And just a reminder, if you jumped in on this episode and you’ve not heard any others: this is Dara’s penultimate episode with us as a main co-host.

[00:02:59] Dan: We’re at the end of this series. We have one more episode left of this series. It’s going to be a special episode. It’s the hundredth episode and it’s Dara’s final episode. So we’ve got something special in store for him, and for, for everyone else as well. So be sure to check us out next week and to see all the goodies we’ve got in store for that.

[00:03:16] Dan: All right. Well, plugs over. Thank you so much for checking us out. Stick through, enjoy the episode and we’ll see you over on the Slack community.

[00:03:27] Dara: Hello and welcome back to The Measure Pod. We’re delighted to be joined today by Gaurav Tiwari. So Gaurav, firstly, a big warm welcome to The Measure Pod and thank you for agreeing to come and chat to Bhav, Dan, and myself.

[00:03:40] Gaurav: Thank you very much. It’s really nice to be here and I’m really looking forward to it.

Topic

[00:03:45] Dara: So we always say this, Gaurav, rather than me doing a really bad job of introducing you and getting everything wrong, we’ll turn it over to you. So in as much or as little detail as you like, maybe give us a little whistle stop tour of your background and what you’re up to these days.

[00:03:59] Gaurav: Yeah, absolutely. So, my name is Gaurav and I have been working in the field of data analytics engineering for about 10 years now. This has been a wild journey that has led me to spend my time across multiple countries, multiple places, and I’ve been fortunate enough to do that. And, as of recently, I work at Spotify as an engineering manager, which is

[00:04:26] Gaurav: somewhat a bit of a different place than data specifically, but does revolve around it, as I lead teams which are very much focused on the back end side of things or the front end side of things over the last two and a half years or so. But other than that, the career trajectory that I have been on started off from being a data engineer myself and then being a data analyst, leading data teams, setting up structures, setting up architectures and whatnot.

[00:04:53] Gaurav: Right. So that’s something I’ve been focusing on for the last 10 years, as I said, and yeah, that’s a bit about myself and what I’ve worked on.

[00:05:04] Dan: Awesome. Well, Gaurav, I managed to, I think, corner you, or one of my colleagues cornered you, at an event the other week, or at least at the time of recording.

[00:05:13] Dan: So you did an amazing talk, a presentation at the London Analytics meetup back at the end of November 2023. And do you know what? It was one of those presentations, and I think you even made the point that if you had more questions at the end of it than when you started, then you’ve done your job effectively.

[00:05:30] Dan: Basically, it’s been a good presentation and I have a thousand questions and I’m not sure I am any clearer about a lot of them, but I feel like I learned a lot at the same time. So I’m feeling confused. So how about you give us a quick whistle stop tour around that presentation, what you spoke about, and maybe why I’m scratching my head here.

[00:05:50] Gaurav: Yeah, absolutely. So, it was one of those things which I’ve been thinking about for quite some time. During the downtime between work, I’ve been thinking about how the notion of self-serve analytics, which is what we call it when non technical users can basically explore data, build analysis, make insights, and take action on them,

[00:06:12] Gaurav: is evolving as we go now. There are quite a few players out there who are trying to solve that problem. And since we are living in this place where you can’t speak a single sentence without introducing the word AI in it, I thought about what that means for us going forward, right? And what would a world look like if we talk about empowering people

[00:06:34] Gaurav: with AI to be able to make those kinds of decisions or do self-serve analytics, right? And I talked a little bit more about how, throughout my 10 years working in this field, I’ve seen evolution in terms of the different ways people have done analytics so far, and it started off from the very basic

[00:06:57] Gaurav: things, where someone builds up a dashboard for you and you basically look at that, to the part where we had some interactivity, where someone said, oh, you know what, you can do some filtering as well. So now you don’t have to build a dashboard for every single country, to say so. You can wrap it all up in one go. Then to the point where we had web based

[00:07:13] Gaurav: analytics tools, which essentially helped you to create your own small visualisation or small dashboard by yourself. Right? And that’s kind of the point where I see we are. And I started realising throughout my time that it’s good. It’s definitely much better than where we were 10, 15 years ago, but

[00:07:37] Gaurav: is that still the way to go? Is that the way that people see people using analytics going forward? And from my experience working and setting up analytics platforms for different companies, I felt that that’s really not the solution. And generally speaking, if you’re speaking about self-serve analytics as a way to empower people, are people actually being empowered right now?

[00:08:03] Gaurav: Because there is data, and there are studies, that suggest otherwise. Something like 30 percent of the total population in a company are actually empowered to do that. So that’s roughly one in three. If self-serve is so beneficial, why is there still only a one in three chance that people are able to use it, right?

[00:08:23] Gaurav: And what I said, which I think resonated very well with the audience there, was: if it takes you three levels of certifications to say that, yeah, I’m an expert in the self serve analytics tool, it’s not really self-serve now, is it, right? Because you need some sort of education to be able to do that.

[00:08:39] Gaurav: And this has an impact not just on the side of end users, but also from a data team point of view it is quite a big amount of overhead, because when users can’t use self-serve analytics tools, who is going to help them? It’s the data team, right? And studies show that it takes on average around four and a half days for a data analyst to build up a dashboard, which is quite significant when you’re talking about using self-serve analytics tools, right? And the same study suggests that almost 50 percent of your data team’s budget is wasted on those ad hoc tasks, which is still quite significant, right?

[00:09:21] Bhav: Yeah, and I think, I think one of the big challenges is that, especially when we think about the current data landscape, there is this, I think, a misinterpretation of the term self serve.

[00:09:31] Bhav: I think a lot of people think self serve means, hey, here is a suite of dashboards, you are now self serve. But actually self serve doesn’t mean that at all. Self serve analytics is more about empowering users to go deeper and explore those dashboards. So the dashboard might be a starting point, but it is by no means the end point.

[00:09:52] Bhav: And I don’t know if you listened to the episode, but self serve has been a passion project of mine for many years, and it’s also been the biggest failure of my life. So I’m keen to understand why you feel like this is something worth tackling.

[00:10:08] Gaurav: I’ve done implementations around self-serve analytics across different roles. In one of the roles, where I worked as a manager for the BI team, we introduced Tableau as a self-serve analytics tool and basically went from not having anything to a point where almost 600 people were using that tool altogether. And there were quite a lot of learnings from there. We set up a lot of architecture and infrastructure for users to be empowered to do that.

[00:10:36] Gaurav: But yet, even after all of that, it’s just, as you mentioned about the failure part, we still spent a good amount of time explaining to users how to use self-serve analytics. We spent a good amount of time helping them like, oh, this dashboard. I was trying to make this, but instead it did that. How do I fix this?

[00:10:55] Gaurav: Right and it’s like you might as well just do it yourself. Right. So I think that’s why I feel like this is a problem worth thinking about because every organisation that I’ve been to asks the same question. Oh, yeah, this is data, but how do I use it now? Like, yeah, you have Tableau, but what? Like how? Do I drag and drop?

[00:11:14] Gaurav: Yes but if you drag and drop a measure, it becomes an average or a sum. What am I, what should I be using? That’s the part that I feel like it needs to be tackled.
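
To make the sum-versus-average point concrete, here is a minimal sketch in Python. The order rows, the column names and the `aggregate` helper are all made up for illustration and are not from the episode. It only shows that the same revenue measure gives different numbers depending on which default aggregation a tool applies.

```python
from statistics import mean

# Hypothetical order-level revenue rows (illustrative data only).
orders = [
    {"country": "UK", "revenue": 120.0},
    {"country": "UK", "revenue": 80.0},
    {"country": "DE", "revenue": 300.0},
]


def aggregate(rows, how):
    """Group revenue by country and reduce each group with a sum or an average."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["country"], []).append(row["revenue"])
    reducer = sum if how == "sum" else mean
    return {country: reducer(values) for country, values in grouped.items()}


totals = aggregate(orders, "sum")    # total revenue per country
averages = aggregate(orders, "avg")  # average order value per country
print(totals)
print(averages)
```

For the UK rows the total and the average differ, which is exactly the choice a non technical user is asked to make when they drag a measure onto a chart.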

[00:11:22] Bhav: Yeah but 600 people is successful, that seems successful to me, right? It’s 600 people in what, I mean, what size is the organisation? 600 users. If it’s 600 out of seven, 800 people, that is a successful self serve program, right?

[00:11:36] Gaurav: Yeah, it was about 1200 people. So that’s still 50 percent adoption. But I’m quite proud of that achievement. To be honest, it was at a time where self-serve analytics was, it definitely was not like that at the very early stages, but it was also at a stage where there were not a lot of mature products or mature implementations out there.

[00:11:56] Gaurav: So I spoke at various types of conferences, or Amazon AWS conferences, showing how we implemented that, which is really good. But then as you kind of move from that stage to the next stage of evolution, you realise that there are still some drawbacks that you need to kind of look into.

[00:12:12] Dan: Is this then like data literacy? I’m just trying to think about how, you know, you can provide self-serve access to stuff, and then it’s like, are they doing an average or a sum or a kind of stupid thing with numbers? I think that self serve, I feel like it’s trying to tackle too much at once. It’s trying to be like, okay, it’s data storytelling.

[00:12:29] Dan: It’s data visualisation. It’s accessing databases. It’s data literacy at the, at the more foundational level. Like, is there, is there such thing as, as, as a self serve project or is there more kind of like, you know, we need to make sure there’s a base level of like maturity or literacy at a level where you can trust people to not to mess up a sum calculation or something like that.

[00:12:48] Gaurav: You put it quite well about the data literacy part, because that’s something that I was always thinking about. Self serve analytics is supposed to benefit the non technical audience, let’s say someone in sales, someone in marketing, or someone in operations who is not very much interested in writing SQL queries, but would rather just use the dashboard and create their own things.

[00:13:10] Gaurav: But what I’ve observed is there is only a certain threshold up to which someone is willing to get their hands dirty on these kinds of things. After a certain point, that’s not what someone joined for, or that’s not their specialty, right? Conversely, if you were a data analyst and you were told, you know what, you’re going to spend

[00:13:30] Gaurav: the next 20 days or 10 days learning how to do sales reports, learning how to do sales, talking to people, you might be interested in it, but is that something that you’d want to get super into? You know, probably not, right? So that’s where I think providing those trainings, providing that literacy about how to use a tool, only goes to a certain extent, because at the end of the day, as a non technical audience, you have 100 different things to prioritise, and learning how to be more data savvy is not very high on your agenda.

[00:14:00] Bhav: And I think we talked about, we touched on this, didn’t we Dan, and it came down to the motivation levels of someone wanting to pull their own data or needing to pull their own data. Of course, if there is a bottleneck within the data team, you know, you kind of find yourself in a situation where you maybe don’t have a choice, but to pull your own data.

[00:14:18] Bhav: But if the option is there to lean on an analyst or a data scientist or something, why would someone want to do it themselves? And I think this motivation thing has been maybe at the heart of why self-serve programs across the industry have maybe not been as successful.

[00:14:38] Bhav: Because you are right, there are so many layers to it. I do want to move into the future, but I just wanted to touch on the fact that I love that you’ve recognized, and you mentioned it very early on, that actually BI programs and self-serve programs have been on this evolutionary journey from, you know, first building a dashboard, then allowing filterability and, you know, dates and segments. And then it’s kind of evolved, but it’s still not quite where it should be. So, do you think it can be successful anywhere?

[00:15:10] Gaurav: I think it very much depends on the environment. So with the current state where we are, it’s definitely higher than what it used to be. But when we talk about success in saying

[00:15:20] Gaurav: everyone is fully aware about what data is, where it resides, how can I make it actionable and make decisions on it, I think there’s a long way to go, to be extremely honest. I think there is still that aspect of a barrier which stops people from getting into that world of data. I remember, in my first job, anything that was not in an Excel file, people wouldn’t touch it, right?

[00:15:44] Gaurav: Because it’s like, can you just? That’s the first thing that a lot of people ask, right? Can I export it to Excel or not, right? So I remember that, and I work as a consultant in addition, and I see so many organisations where they are still in the same stages. Yes, of course, you can talk about big organisations or really good startups where they have evolved to that maturity level, but still, at a very rudimentary level, there are still people who can’t process anything other than Excel.

[00:16:14] Gaurav: Right. And that’s purely because it gives you that ability to move around and manually edit things, right? So talking about success, I think there’s still a long way to go before we get to a point where we say people are actually able to act on that information immediately, to say so.

[00:16:29] Bhav: What do you think would be a good next step? Let’s say, you know, the utopian world of self-serve analytics covers everything that Dan touched on. So, data literacy, visualisation, storytelling, you know, all of those things. Let’s say that is the utopian state of self serve. What would be a fundamental next step that organisations should be looking to achieve, you know, to get onto that journey to self serve success?

[00:17:01] Gaurav: Yeah, that was exactly the question that I asked myself when I was thinking about, like, where do we go from here? Right. And in an ideal world, what, when you look at a job description for a data analyst or someone who wants to kind of help out, it’s like you want to empower users with data so they can take quick actions and basically don’t have to wait three days to get that result.

[00:17:24] Gaurav: Right. That’s the utopian state in my mind. Right. And when I think about that state, I think about the barriers that are causing us to not get there, and the most fundamental thing, I think, is a language barrier. I think the two parties on different sides of things are not speaking the same language.

[00:17:46] Gaurav: The data person is speaking more like a SQL or database-ish language, whereas the person on the other side is more interested in the business outcome of that data. Right? So as the data person, I’m interested in making sure that all of my records have the right data quality,

[00:18:05] Gaurav: that I can sum it up properly, that there are no null errors and whatnot. Whereas the other side is purely interested in what’s the business value out of this information, right? So I think that, in my opinion, when it comes to self-serve analytics, or generally analytics, that’s the language barrier that I see needs to be brought down.

[00:18:21] Gaurav: And when I think about solutions to that, I do see that there is a good role that generative AI can play to bridge that gap, right? So it’s not a new problem space. People have been trying to solve this problem for quite some time, but with the fact that anyone can spin up an API on OpenAI and basically ask models to create code

[00:18:52] Gaurav: which is being used readily, it can be a way to bridge that gap much faster, maybe. So that’s where I’m thinking about going forward, yeah.

[00:19:01] Bhav: Isn’t that maybe two or three steps down the line? For example, in order to ask the generative AI, or whatever, to write those APIs or write the SQL query, there needs to be a fundamental understanding of what the data looks like and what’s available and how it connects to each other.

[00:19:18] Bhav: And I think you’re probably going to touch on it, the semantic layer or presentation layer, or however you term it. I don’t even know if there is a difference between those two terminologies, but I’ve always gone with the semantic layer, but there’s also the presentation layer.

[00:19:33] Bhav: So would that not be a good first next step? I don’t say the creator of the data, because the creator of the data can be anyone. I guess the people who process that data take it upon themselves to ensure that this data layer, this semantic layer, presentation layer or whatever, completely matches business terminology, to touch on what you were just talking about.

[00:20:00] Gaurav: I understand and I agree that the semantic layer is the place to go. And there’s two things that I think about. The first one is, in most of the use cases, as a data analyst, you would already try to streamline the piece of information that you would like to show in a dashboard in some sort of a layer, call it semantic, call it presentation.

[00:20:20] Gaurav: That’s absolutely up to you, right? And the second point, what you mentioned, is trying to add in business terminology, so that it makes a lot more sense, right? But more often than not, this experience is never at a level where a business user can simply look at that data and basically say, yeah, I understand everything that’s happening there.

[00:20:39] Gaurav: So I do see that the semantic layer is a fundamental aspect of making sure that your user, your AI bot, or whatever that we can talk to, can basically use that as a way to get information, right? Because if we just simply say, let’s say you are a new analyst, and I tell you that this is the raw data from 300 different sources, get me this report,

[00:21:07] Gaurav: I’m quite certain that’s not a very good place to be in. Right. So that’s where I see that the semantic layer is basically a fundamental requirement for us to even get to the point where we want to go. Right. And that’s what’s happening today as well.
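
As a rough illustration of what a semantic layer does, the sketch below maps business terms to physical columns so that a person or a bot can ask for "net revenue by country" without knowing the raw tables. Everything here is an assumption made for the example: the table name `analytics.sales_summary`, the column names and the `build_query` helper are hypothetical and do not describe any particular product.

```python
# Hypothetical semantic layer: business terms mapped to physical SQL.
SEMANTIC_LAYER = {
    "table": "analytics.sales_summary",
    "metrics": {
        "net revenue": "SUM(net_revenue_gbp)",
        "orders": "COUNT(DISTINCT order_id)",
    },
    "dimensions": {
        "country": "country_code",
        "month": "order_month",
    },
}


def build_query(metric, dimension, layer=SEMANTIC_LAYER):
    """Translate a business metric and dimension into a SQL string."""
    if metric not in layer["metrics"]:
        raise KeyError(f"Unknown metric: {metric}")
    if dimension not in layer["dimensions"]:
        raise KeyError(f"Unknown dimension: {dimension}")
    metric_sql = layer["metrics"][metric]
    dimension_sql = layer["dimensions"][dimension]
    return (
        f"SELECT {dimension_sql}, {metric_sql} AS value "
        f"FROM {layer['table']} "
        f"GROUP BY {dimension_sql}"
    )


print(build_query("net revenue", "country"))
```

The point of the design is that only the mapping needs to know about the underlying sources; anything built on top speaks in business terms and fails loudly on a term the layer does not define.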

[00:21:22] Dan: Is the semantic layer, or the presentation layer, just the new dashboard in the world of self serve? I’m just thinking it’s a piece of tech sitting in between the end user and the analyst or the data team that has to be maintained and updated and contextualised for them to be able to self-serve. And in the power, or the magic, of editing and scheduling, we should already know everything about semantic layers from our guest, David, whose episode would have come out last week, but which we have yet to record.

[00:21:50] Dan: So at the time of recording, we have no knowledge of semantic layers, officially. But the thing about all this kind of stuff is, like the way that you explained a dashboard, right, at the very beginning, it’s like, okay, it needs to go through this four day process on average to be set up by

[00:22:04] Dan: a data person, and then the self serve can start, or to some extent they can start accessing the data. Now we’re saying there’s magical stuff to do with generative clever stuff behind the scenes, but it relies on this other piece of software, which is a layer of context, a presentational layer, which also has to be set up by the data team and has to be maintained by them.

[00:22:23] Dan: And it feels like we’re replacing one for another. What are your thoughts on that? Is that the case? Or am I kind of getting the wrong end of the stick with this?

[00:22:30] Gaurav: I would say it’s not entirely the case that we are replacing one thing with another. That’s one thing that I would say, because there are different ways you can put it: semantic layer, presentation layer.

[00:22:43] Gaurav: Everyone calls it very differently. Before semantic layer became the term, people used to call it a reporting layer. There were cubes that were created in the past, which were specified for a particular place, right? We used to use the term cubes quite a while back, which was like, specifically for this particular market department, right?

[00:23:00] Gaurav: So I don’t think we are quite far away from this. There have been rapid changes, massive changes in how we present it, how we work through it, but the fundamental aspect at the very bare bones remains the same: of all the data sets that we have, how can we bring it down to one summarised, cleaned, aggregated way that can be used by our users?

[00:23:22] Gaurav: Be it writing SQL queries, be it funnelling into a dashboard, be it making it a self-serve basis, right? What the concept of the semantic layer has offered is making it much easier, in my opinion, to not parse through those underlying 300 data sets, but actually just look at this layer, and that’s your business layer, to say so, right?

[00:23:41] Gaurav: So you’re not ideally replacing anything or adding new things, right? It’s been there since the beginning, it’s just different in different maturity stages, to say so. The part where I see a bit of a challenge, and that probably will change, is how people use it. How people use that semantic layer is different now, or can change, with generative AI.

[00:24:04] Gaurav: So, usage of tools like Tableau or Looker or anything, for that sake, does require a specific set of knowledge. That takes time to build up, and it takes time to have the confidence to say that I have built this dashboard and I know with a very high degree of certainty that it’s the right way to build it. That takes skills and that takes experience to build, right? And that’s the barrier that I think needs to be lowered

[00:24:32] Gaurav: for more users to adopt using the semantic layer altogether. So that’s where I see the evolution of the future going now.

[00:24:42] Bhav: And how do you think AI is going to help lower that barrier?

[00:24:46] Gaurav: Yeah, I think we have been seeing a lot of use cases where developers, for example, in my team or across, are using Copilot as a way to basically assist them in creating code, right?

[00:25:02] Gaurav: We also have strong improvements in models for natural language processing, which have seen good benefits, in which they can parse a natural language, and when I say natural language, that’s the common people language, right, that I can write as a technical person or a non technical person, and basically convert that into a code based language, like SQL or Python code or whichever you would like, right?

[00:25:32] Gaurav: So having that combination of LLMs, or these large language models, being able to convert, let’s say, simple natural language to code that can be used to run analysis, that can be used to run dashboards, is probably where I see AI helping out to basically bring that barrier down. So take the example of you being a non technical person who has no idea, let’s say, about SQL or any kind of thing.

[00:26:07] Gaurav: All you’re interested in knowing is, like, what is our financial performance for a particular state over the last 12 months. That’s the question you have. You don’t really care where that information comes from. You’re more interested in getting that number in a chart, in a table, or whichever format.

[00:26:23] Gaurav: That’s all you need to know, or that’s all you should know. You shouldn’t be going into logging into a particular portal, learning how to use that tool, which data sets do I want to use, how do I drag this to make a chart, and things like that, right? So, that’s where I see the future slowly starting to get towards.
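
The flow described here, a plain-English question going in and a query coming out, could look roughly like the sketch below. This is only an illustration under stated assumptions: the keyword matching is a deliberately simple stand-in for a large language model, and the `METRICS` mapping, the `sales_summary` table, the `months_ago` column and the `question_to_sql` helper are all hypothetical.

```python
import re

# Hypothetical business terms mapped to SQL expressions (illustrative only).
METRICS = {
    "financial performance": "SUM(net_revenue)",
    "orders": "COUNT(order_id)",
}


def question_to_sql(question):
    """Turn a plain-English question into a parameterised SQL string.

    The keyword matching below is a stand-in for a language model.
    """
    text = question.lower()
    expression = next(
        (sql for term, sql in METRICS.items() if term in text), None
    )
    if expression is None:
        raise ValueError("No known business term found in the question")
    # Pick up a time window such as "last 12 months"; default to 12.
    match = re.search(r"last (\d+) months", text)
    months = int(match.group(1)) if match else 12
    sql = (
        f"SELECT state, {expression} AS value "
        "FROM sales_summary "
        "WHERE months_ago < :months "
        "GROUP BY state"
    )
    return sql, {"months": months}


sql, params = question_to_sql(
    "What is our financial performance by state over the last 12 months?"
)
print(sql, params)
```

The user never sees the table or column names; they only phrase the question, and an unrecognised term raises an error rather than producing a guess.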

[00:26:41] Dara: Is there a risk? I take the point there that you don’t want people having to work really hard to get to a simple answer to a question that they have. But is there some risk? Let’s say that’s on a spectrum. Is there a risk that if AI makes it too easy, it actually takes away the motivation or need for the person to answer the question and then use their own curiosity to better understand the data and where it’s coming from?

[00:27:06] Dara: Because sometimes you don’t want to, you don’t want to have to go around the houses, but sometimes you learn by going through that process. You learn some of the nuances in the data. You learn some of the context that maybe there’s a risk that if AI is doing too much of that, that people will be trusting the AI and then making incorrect decisions based on that.

[00:27:25] Gaurav: Yeah, I think you’re right on that front that there is a certain amount of risk associated with certain things. But, when I think about it, from a decision making perspective, I also think about how many decisions are made right now without even doing any kind of analysis. Right? So if you think about going back to that 4, 5 days period that I was talking about, that time it takes if you’re a time crunch on a piece of decision as a marketing manager, you have to make. And it takes four and a half days for you to get the result, whether this is a place where I should spend 100,000 or 1,000,000, in all probability, you’re not going to wait.

[00:28:05] Gaurav: If you’re in a time crunch position, you will end up making decisions. Yeah, I think that’s the way it should be. And let’s, let’s just go for it. Right. And I’ve seen that happen more than you would think. so I still think about having. Being curious to answer those, get answers to that question and being a bit more rational around whether this makes sense or not is important.

[00:28:25] Gaurav: And I think right now, just continuing on the example of a marketing manager, they are still doing that. Like back in the days when people were looking at dashboards, they were still looking at those numbers and trying to make inferences and understand whether this makes sense or not. What we are offering, or what I’m thinking about the future, it just enables you to do more.

[00:28:45] Gaurav: Without having to get stuck in places. Right. And it’s like the problem remains the same. How much in depth can you go is what everyone is trying to solve, right? How much can you know about that problem in a short span of time? So I do see there are some risks associated with that. But I also see that in this time, when time is probably the most important commodity that you have, not enabling or not using these assistive tools will probably have bigger risks, to say so.

[00:29:21] Dan: I, in the magic of time travel, would have already asked David this in the last episode, if I remembered to. But I want to ask you now, for the second slash first time. So, the kind of natural evolution of the technology that we work in: just take things like ETL, right, which evolved into ELT because processing and storage costs got so much cheaper that it’s just throwing it in.

[00:29:45] Dan: We’ll figure it out afterwards and then we’ll do the transformation once we have it in our own place. Then you kind of go into the world of things like data meshes and kind of connective cross cloud platforms, where, if I can just access the data, I actually don’t need to keep a copy of it.

[00:29:59] Dan: I actually don’t need to have my own duplicated data set there. And it became accessible. And I’m just wondering, with the evolution that’s going on in the cost of large language models and generative AI right now, if that continues to come down, would there eventually be a need for a semantic layer or reporting layer in between to provide the clarity and the context, when it can crunch 300 different data sets?

[00:30:22] Dan: You can pre-prompt it through a thing like a custom GPT with some contextual stuff and you’re prompting it, so the semantic layer becomes prompts. And I’m just wondering if you see it going down that path and then it truly becomes kind of throw it in and access it galore, right? Everyone can access everything.

[00:30:38] Gaurav: I’m smiling a bit because I literally tried doing that last week. So I spent a lot of time, in my spare time, kind of trying it out. Like, what happens? So I created a data set with, I think, about 235 or so different tables and basically just pointed a GPT model at it to answer a question which probably could have been solved with just two tables.

[00:30:59] Gaurav: Like, let’s throw it in and see what happens, right? It’s not as accurate. And that’s something that I was thinking about as well, that it’s not going to be the place where you would get exactly the most accurate result, because there’s just a lot of information to parse through, and within the time that we wait for a GPT or the model to give you an answer, it will make some decisions to basically say, this is the data set that I’m going to use to answer those questions, right?

[00:31:27] Gaurav: So my answer to that question is, I don’t know, it is possible. But in the current context, providing the right context to your model is very important. And that’s, that’s, that’s the key to answer or to get the right answers. Right. And when it comes to data, providing a semantic layer is basically providing the right context. In my opinion, right now, if you don’t do that, the output will be tricky.

[00:31:55] Gaurav: Output will be far from what you’re expecting. And yeah, maybe in future we come to a point where you don’t really need the whole idea of a semantic layer, but from where we are now, it is still very much a necessity to provide good context, and a semantic layer or presentation layer, whatever you call it, does a pretty good job in providing that context.

[00:32:18] Gaurav: So going back to that example, throwing 35 to 38 model tables to GPT versus throwing those three tables to GPT, the output was significantly different: faster and more accurate, to say so, right? So I do think that in the current stage, it is important.
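[Editor’s sketch] The idea of a semantic layer acting as context for a model can be illustrated roughly as follows. This is a minimal sketch, not the setup Gaurav used: the table names, column descriptions, and the keyword-overlap scoring are all hypothetical, and the resulting prompt would still need to be sent to whichever model API you use. It only shows how a small set of described tables can be selected and handed to the model instead of every table in the warehouse.

```python
import re
from dataclasses import dataclass


@dataclass
class TableSpec:
    """One entry in a (hypothetical) semantic layer."""
    name: str
    description: str
    columns: dict[str, str]  # column name -> plain-language meaning


# Hypothetical semantic layer; a real one would describe your own tables.
SEMANTIC_LAYER = [
    TableSpec("orders", "One row per customer order", {
        "order_id": "Unique identifier of the order",
        "campaign_id": "Campaign credited with the order",
        "revenue_gbp": "Order value in pounds",
    }),
    TableSpec("campaigns", "One row per marketing campaign", {
        "campaign_id": "Unique identifier of the campaign",
        "spend_gbp": "Total spend in pounds",
    }),
    TableSpec("web_sessions", "One row per website visit", {
        "session_id": "Unique identifier of the visit",
        "landing_page": "First page viewed in the visit",
    }),
    TableSpec("warehouse_stock", "One row per product per warehouse", {
        "sku": "Product code",
        "units_on_hand": "Units currently in the warehouse",
    }),
]


def _tokens(text: str) -> set[str]:
    """Lowercase alphabetic tokens; underscores split column names."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevant_tables(question: str, layer: list[TableSpec], limit: int = 3) -> list[TableSpec]:
    """Rank tables by naive keyword overlap with the question (an assumption,
    standing in for whatever retrieval a real tool would use)."""
    words = _tokens(question)
    scored = []
    for table in layer:
        text = " ".join([table.name, table.description,
                         *table.columns.keys(), *table.columns.values()])
        score = len(words & _tokens(text))
        if score:
            scored.append((score, table))
    # Stable sort: ties keep their original order in the layer.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [table for _, table in scored[:limit]]


def build_prompt(question: str, layer: list[TableSpec]) -> str:
    """Assemble a prompt that exposes only the relevant, described tables."""
    lines = ["Answer the business question using only these tables:"]
    for table in relevant_tables(question, layer):
        lines.append(f"- {table.name}: {table.description}")
        for column, meaning in table.columns.items():
            lines.append(f"    {column}: {meaning}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What revenue did each campaign generate last month?",
                       SEMANTIC_LAYER))
```

With the example question, only the orders and campaigns tables end up in the prompt, which mirrors the point made in the conversation: fewer, well-described tables give the model better context than hundreds of undescribed ones.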

[00:32:37] Dan: Okay, so a lot of our audience are, I suppose, we fly the flags for marketing analytics and product analytics at the moment. So, within that kind of context, and I’m not pigeonholing all of our audience into those two categories, if someone’s like, “That sounds great, I’ve heard this word semantic layer and I like the idea of giving a chatbot to my staff or my team or whatever”, where the hell do you start?

[00:33:02] Dan: Like if you’re in that kind of company that doesn’t have a team of data scientists or analysts, maybe there’s a small team of people that are dedicated to this and they’re just like, this is an objective for 2024, where do you start? What kind of tech do you find you have success with, or you think is more accessible right now? And I know that we’re doing a slice in time on the day of recording this, but like, where do people start?

[00:33:24] Gaurav: It’s very easy to go to ChatGPT in a web browser, drop a file and basically do data analysis to some extent. So if that’s your use case, more power to you, basically do that. But from a bit more of a scalability perspective, you would need to think about either leveraging those APIs yourself, or there are so many different providers which are starting to come up on the market now, at scale, which offer something similar, right?

[00:33:55] Gaurav: So, you can basically leverage those tools and those stacks to connect to your models, connect to your databases or connect to your tools themselves. For example, you said that if you are in a position where you don’t have a data warehouse, you don’t follow an ELT process, you simply connect to your marketing platforms and basically use their built-in analysis to do that.

[00:34:22] Gaurav: There are tools out there, which are now popping up and coming out of their beta versions, which can help you in doing that stuff as well: to integrate those things, build those semantic layers that we talked about and help you answer those questions as well. Right? So my suggestion would be to kind of start looking around, because the space is evolving rapidly as we speak, every day.

[00:34:45] Bhav: My only, I guess, drawback and thing that I’d be most pessimistic about is that the introduction of a new tool or technology in itself in any organisation is a barrier, right? You have to go through, you have to get a budget and sign off, you have to go through seats and how many seats and who gets allocated a seat.

[00:35:02] Bhav: And once you’ve, once you’ve got that, you then need to find a way to connect that, that GPT or that technology to your database. And someone needs to do that. And then that becomes an engineering problem still. And then if the engineering team hasn’t built a good enough semantic layer that they would trust it.

[00:35:19] Bhav: I mean, you said that yourself, you threw 300 tables at a custom GPT model, and you know, it wasn’t reliable. But if you had the capacity to do the engineering work to compress that down to 10 tables, you know, that’s still going to be a problem that most companies face: they’re going to have to do the engineering work in the background to be able to get that connected and working. Like it just, it doesn’t feel like an easy next step.

[00:35:47] Gaurav: Yeah, I agree. It’s not like a simple plug and play kind of setup right away. But I won’t be surprised. And that’s where, when I said that we will end up having more questions going forward, I won’t be surprised that there are tools out there, which are trying to help you out.

[00:36:03] Gaurav: For example, I’ve been following a few tools and they are investing heavily, not in improving the AI model to generate answers, but in solving that aspect that you mentioned: how can we make sure that they are easier to plug and play? How can AI itself be used to build a semantic layer, so that you don’t really have to do that kind of plumbing to get to a point where AI can be used to get the answers, right? So I do see that evolution coming in already.

[00:36:36] Bhav: And the only reason I bring this up is because, over the years, I’ve been burned many times, typically by the marketing team, who have gone out and purchased some technology without speaking to tech, without speaking to engineering, without speaking to the data team, and they’re like, hey guys, we’ve got this platform, and it’s to do reporting, it’s to do experimentation, it’s to do blah, blah, blah, but we need to feed it a whole bunch of data. And there’s so much shit in the backlog already, and you’ve now got this new thing that one team wants you to integrate.

[00:37:11] Bhav: So I guess I wanted to vent for a second there, just so you know. So I’m glad I got that off my chest. I feel much better. But who should drive this? Right? Like, this then becomes the million pound question: who should drive an organisation to take that first step towards that utopian future that we talked about?

[00:37:34] Gaurav: Yeah, it’s an interesting question. I was thinking about it. And when I was putting together this presentation that I gave, I thought about this from both sides. Let’s take an example from a marketing perspective, which is a non technical audience, to say so, versus from a technical perspective, from a data team. Let’s say you’re a chief

[00:37:55] Gaurav: data or analytics officer at a place. Like, what’s in store for you to be able to kind of take the first step here, right? So I thought a lot about, as a data person or as a data analytics leader, what’s the benefit that you get from it? And that’s where I started thinking that the amount of budget that you spend on ad hoc data tasks or something

[00:38:18] Gaurav: is still quite high, or the amount of time that your analysts or data engineers or data scientists are spending to basically answer certain questions which are one-off, like you ask this question once and never go back to it, is still significantly high. So think about it as an investment

[00:38:36] Gaurav: into where the future goes. At some point in time, 10 years back, you never had the concept of data engineers, data scientists or analysts. They were just wrapped up into one person and slowly it started evolving. I remember my first role. I worked as a pricing analyst, which was nothing but a data scientist, but at that time, data science was not even a term you would use for that. Right.

[00:38:56] Gaurav: So I think it’s in that stage, it’s, it’s a step that benefits both from a marketing perspective as well as from a data analytics perspective. I see that if tech or data, to say so, takes the first step, it will be a lot more beneficial because they understand the stack and the usage of that stack much better.

[00:39:18] Gaurav: And I was laughing when you mentioned a marketing team coming up and saying that this is the tool that we have highlighted. That has happened so many times, and you end up redoing the entire pipeline to feed that tool. And that’s usually not the way it should be. It should be the other way around, right?

[00:39:34] Gaurav: So that’s my thought process around why someone in a data role holds that responsibility, given the breadth of their view, to be able to take these kinds of decisions and invest in that going forward.

[00:39:48] Dan: So, as we come to the end, unfortunately, of our time, I just want to give one last opportunity, Gaurav. So, as we go back to the beginning, we probably all have more questions than we did when we went into this one, which is a success, because it’s a complicated thing and it’s not easy. If people wanted to ask you those questions or to get in touch with you in some way, can they, how can they, and where can they do that?

[00:40:12] Gaurav: Yeah, absolutely. So, it’s an evolving field, and I am learning and curious myself, so I’d love to have discussions with anyone who is interested. I’m available on LinkedIn, and also, as I said earlier on, I run a data analytics consultancy by myself to help businesses scale up their data setup and help with semantic and presentation layers,

[00:40:33] Gaurav: if you want to get into more of that. And of course, to understand them and understand how I can help them, because in a lot of cases, it’s just not needed right now. So being able to make that decision and say that it’s not the only thing to jump into. Feel free to connect with me on LinkedIn and I’ll be more than happy to help out.

[00:40:52] Dan: Awesome. We’ll put some links in the show notes.

[00:40:55] Bhav: I was going to say, I think we should get Gaurav to come speak at a CRAP Talks event. There’ll be loads of questions I think there, so, you know, it’s the first time I’ve (yeah, why not?) thought of inviting a guest over to speak at a CRAP Talks event.

[00:41:07] Dan: But it’s the start, is it, it’s going the other way round rather than the CRAP talks coming on the pod.

[00:41:11] Bhav: Well, this is the benefit of having a podcast and a meetup community, so you can bounce between the two. I have one final question before we move on. I imagine we want to move into the rapid fire round, that’s our favourite round. Which is: why do people want to export to Excel?

[00:41:29] Dan: God, end on a, on that bombshell.

[00:41:32] Bhav: You mentioned that, you know, like, and I think you use, you know, at some point we’re going to be able to just go to, an AI tool and just say, show me the financial sales for last year, broken, you know, people will still want to export to Excel. Like, why do, why do people want to export to Excel? This is for anyone, by the way. It’s not necessarily directed.

[00:41:51] Gaurav: No, no. I, I, I’ve thought a lot about this as well, and I think, it’s more psychological than, than the ease of use. But I think people, when, when you, when you talk to someone, for example, I, I’ll give one real quick example.

[00:42:05] Gaurav: If you are sitting with someone and they’re operating the computer and you’re basically saying, like, can you help me book a flight together? You would want to be as plugged into the screen as the person who is actually doing that, right? Because it’s a psychological thing that you feel you are a lot more in control of what you’re doing, as opposed to if someone else is doing that and you don’t fully understand the way they work, right?

[00:42:26] Gaurav: So, I think when you have that data in Excel or Google Sheets of some sort, maybe Excel more than Google Sheets, because that’s still collaborative, in Excel you feel that you have control to change it the way you would like to, and no one else is going to touch it, right? So I think there is something about it, that this is my workspace.

[00:42:48] Gaurav: I have full authority to make changes to this data rather than relying on a database or something that can change. So that’s my take on why that could be the case here.

[00:42:57] Bhav: So to summarise in that case, it’s basically, even with advancements in generative AI and being able to plug and play, you know, ChatGPT-type tools into your data warehouse, the end goal will still be to ask a question that exports it into Excel.

[00:43:14] Gaurav: I really hope it gets solved into a different interface.

[00:43:17] Dan: Maybe they just integrate Copilot into Excel. Maybe that’s it. Maybe that’s Microsoft’s plan. That’s it. It’s all done. That’s their front end.

[00:43:24] Dara: As much as it pains me to say it, I think Microsoft needs a lot of credit for this. Like they, they basically created that familiarity and that’s why everybody still thinks, Oh yeah, I can work with data.

[00:43:32] Dara: Give me an Excel spreadsheet and I’ll be comfortable with that. I’m really curious, Gaurav, just to ask you, I would say final, but I’ve got five more questions afterwards, which I’m going to ask you in quick succession. But my last question based on our conversation: we’ve obviously talked a lot about, this has all been about empowering business users and even the businesses themselves to understand and use data and make decisions based on it.

[00:43:58] Dara: I’m just curious to know what you think in terms of the wider world population. Do you think the advances in technology, including things like AI, are making people more or less empowered to understand and use data to make decisions in their own kind of day to day lives?

[00:44:18] Gaurav: That’s a very good question that I don’t, I need to think, yeah, I need to think through that. But, I think, generally speaking, irrespective of where you are in the stage of maturity, of evolution, of data, analytics, whatever it is, the part that never dies is the human curiosity to know more about something. There was, and I gave this example in the presentation as well, that, I used to work, I used to study, what we call geospatial analytics to understand geographical positioning of things, in my, in my master’s, and we could use an open source tool to basically click one button and it would layer up different shapefiles together, and my, my professor said, like, what you just did in three seconds, 20 years ago, it took us three days to do something like that.

[00:45:06] Gaurav: Right. So, the level at which we are working and knowing and the way we are consuming information, if anything, it’s, it does make our job easier to make us more curious as we go, right? So thinking about AI, there are different ways in which you can do it. Like if you say that AI will be used to do all mathematical operations, and therefore kids do not have to learn how to do maths.

[00:45:30] Gaurav: I have a very different feeling about that, as opposed to saying that for business users, spending three days plumbing through the data would be a colossal waste of time, as opposed to using those three days to build better actions. Right? So I think that’s where the, the, the part that I go towards is there are benefits.

[00:45:49] Gaurav: There are cons to relying so much on AI to solve these kinds of problems. We just need to be, I think there were some terms around responsibility as well. I’ve been following some of those things, and that resonates well with me. So I think the curiosity for humans to know more about things, or businesses to go more detailed, to understand every single bit, will always remain the same.

[00:46:16] Gaurav: If we find a way to do that very easily. There’ll be like three other things we’ll be more curious about to solve as well. So I don’t see that, that innate desire that we have will die down at any point in time.

[00:46:29] Dan: Which is the Star Trek future, right? That’s the future they depict there. And for those that are watching, not listening, I’m wearing my Star Trek themed Christmas jumper. We’re recording just before Christmas. So, yeah, I just wanted to end on Star Trek. I wanted to get that in there before Dara starts firing questions at you.

Rapid Fire/Outro

[00:46:46] Dara: No, I liked it as well. It was a nice optimistic answer. I think maybe I’m slightly more cynical, but I liked your answer. Okay. Five, five, final five, rapid fire five. So first question, what’s the biggest challenge today that you think will be solved within five years?

[00:47:05] Gaurav: I can speak from a data perspective very quickly, but in a different application, I think, with advances in the compute resources that we have, being able to solve problems that we can’t solve yet, for example pollution and climate change, at a much higher scale because of the advancement in compute will probably be the biggest enablement, in my opinion, and will probably help us not solve them, but reach one step closer much faster than we thought we would.

[00:47:38] Dara: So what will be the biggest challenge in five years?

[00:47:41] Gaurav: Yeah, one thing that’s very top of mind is responsible usage of anything, of technology to say so. I don’t know if you saw on LinkedIn, like, someone basically went to Chevrolet’s website and basically trained their GPT to offer a legally binding contract to sell them a Chevrolet Camaro or something for $1, and the GPT actually said, like, yeah, it’s a legally binding offer that you’ll get a Chevrolet something 2024 car for $1. So I think how you get to that responsible usage is going to be a big, big challenge, and legal is going to have a pretty tough time, to say so, as we go into the future.

[00:48:27] Bhav: I’m going to order a Chevrolet straight after this episode.

[00:48:31] Gaurav: I’ll send you, I’ll send you a screenshot of that, Bhav. It’s, it’s amazing yeah.

[00:48:35] Dara: So, so what’s one myth that you’d really like to bust?

[00:48:39] Gaurav: I would love to have a little bit more visibility around the social impact that people are making by saying that, yeah, we are contributing to making the world a better place, right? Or organisations are saying that, like, we are cutting carbon emissions by doing this, but I think that’s still a big myth when you talk about, like, oh, yeah, are you actually going towards sustainability?

[00:49:09] Gaurav: How can you prove that? Because I still don’t believe the fact that it’s actually doing the job that they’re claiming it to be. There’s still a lot going on.

[00:49:17] Bhav: Do you think B Corps are just another enterprise, a business that’ll get paid for the number of B Corp licences that they hand out? It’s a good response, actually.

[00:49:30] Bhav: I’ve always wondered this myself. Yeah.

[00:49:33] Gaurav: So that’s, that’s one thing that I think I would love for that myth to be busted out there. What’s the actual social impact or environmental impact that we actually talk about? Yeah.

[00:49:44] Dara: And if you could wave a magic wand and make everybody know one thing, What would that one thing be?

[00:49:51] Gaurav: Not related to tech, but generally speaking, the reality of the world. God, I feel like my answers are very cynical. It feels like I’ve been depressed about it, but I think, having travelled to different places, and having the opportunity or the luck to be able to kind of view different mindsets or view different people and work with them,

[00:50:11] Gaurav: I think a lot of us do not have full visibility on how different cultures work or how different people work. So being able to be mindful of that and respectful of that, if people can learn that somehow or basically be aware, maybe learn is a strong word, be aware of how different cultures behave or how different people

[00:50:31] Gaurav: react, how different people respond to information or how you should talk to them. That would be a very good thing to kind of learn, I think, or be aware of.

[00:50:41] Dara: Excellent answer. Yeah. Love that. The last one is easy, I promise. And it is rapid. So what’s your, what’s your favourite way to wind down outside of work?

[00:50:49] Gaurav: I genuinely like being out in the open. So I do, or I used to do, and I’m getting back into it, a lot of triathlons and running. So it sounds contradictory, but I really like going out on a long run, which is like going out for three hours, which really mentally winds me down because at that moment, you’re just focusing on running.

[00:51:14] Gaurav: You’re not doing anything else. Actually, some of the biggest decisions that I’ve made came out during that period of time, because you have the clarity of mind that you can think through things. So basically you just wind down, and when you get back home, you basically can just eat something and sleep. There’s nothing better than that.

[00:51:31] Dara: I get that. As a long distance runner, I can, I can relate, completely agree. Brilliant. Thank you for answering those not so rapid, rapid fire questions, and for the conversation in general. It’s been, it’s been great.

[00:51:44] Dan: We appreciate your time. Thank you, Gaurav. Thanks for coming.

[00:51:46] Gaurav: Thank you for having me, I really appreciate it.
