#140 Taming BigQuery costs with Alvin.ai (with Martin Sahlen)
Martin Sahlen, CEO of Alvin AI, joins the Measure Pod to discuss automating BigQuery cost optimisation through billing model arbitrage, with no AI in the engine.
Dara and Matthew return after a break with a bumper news roundup covering frontier model updates, Google I/O and Next, Anthropic's Mythos and Project Glasswing, and the latest AI design tools. The main event is the official reveal of SEAM (Semantic Engine for Agent Mediation), Measurelab's new product that sits between LLMs and downstream data sources to bring governance, intelligence modelling, and consistent answers to AI-powered workflows.
"If you just ask a question of a data source raw to Claude... it was returning the correct answer less than 50% of the time. With [SEAM], high 90s." - Matthew
"It's recognising that people are gonna be doing this anyway, and it's putting that layer in between their LLM of choice and all the different data sources... ensuring they're gonna get back validated answers." - Dara
Show full (AI-generated) transcript
[00:00:00] Lizzie: Hello, and welcome to "The Measure Pod" by Measurelab, the podcast dedicated to the ever-changing world of data and analytics with your hosts, Dara Fitzgerald and Matthew Hooson. Between them, they've spent more years than they'd like to admit wrestling with dashboards, data quality, and the occasional Google curveball.
[00:00:32] Lizzie: So join us as we share stories about how analytics really works today and where it might be headed tomorrow. Let's get into it.
[00:00:41] Dara: Hello, and welcome back to "The Measure Pod." I'm Dara, joined by Matthew. Hello, Matthew. How are you? I'm okay, but it seemed like you forgot your name for a second there.
[00:00:52] Dara: It's been a little pause. It's been a little while. You may... our regular listeners might realize that there's been a few, a few gaps. So yes I'll [00:01:00] confess, that was a slightly rusty opening. Who am I? Dara. Yeah. Yeah, I remember. Who's the
[00:01:05] Matthew: other guy? Oh, yeah, Matthew.
[00:01:06] Matthew: Yeah.
[00:01:07] Dara: Yep. Yeah. Yeah,
[00:01:10] Matthew: I'm all good.
[00:01:10] Dara: How are you? Yeah, I'm all right. Yeah, just trying to remember how to do a podcast. But otherwise, I am, I'm good. Yeah. Yeah. Trying to think how much detail to go into 'cause it's been a while since we've done, and confession on my part, but I think you realize this, some of our, I think a couple of episodes now, we haven't done our usual our usual kind of starting intro.
[00:01:34] Dara: So I'm also trying to search back through my memory and think, what was... what did we last say? What kind of updates did we last give? So the simple answer is I'm good. Yes. The world is ever-changing. We're still here. And here we are. We are still here. Yeah. Which is good. But we've got-- We'll do a, we'll do a news round-up which could be interesting because it's been a little [00:02:00] while.
[00:02:00] Dara: So yeah, some news. So where do we start? Let's not... We're not even gonna attempt to do this in any kind of chronological order. No. Especi-especially considering I had Claude help me gather some news, and even though I said from the start of April, I think it's gone to the start of April 1995 or something.
[00:02:20] Matthew: Yeah, because
[00:02:21] Dara: it's pulled back all sorts of weird stories that are definitely not recent. So let's, yeah. Demis, Demis Hassabis founded DeepMind. Oh, no, he knows about him being born, actually. So yeah, may- maybe no particular chronological order. Where do you wanna start?
[00:02:41] Matthew: I don't know.
[00:02:41] Matthew: There's been a, there's been a few events. There's been-- But maybe ticking off some models that have come out. So yeah, since March the all of the AI, frontier AI model people, apart from Google, weirdly actually, but Claude and OpenAI have just been doing the point to going .4, .5, .6.
[00:02:59] Matthew: [00:03:00] So .7-- 4.7 for Opus came out. I think that's the only one that's come out since the start of March. I might be wrong.
[00:03:07] Dara: Do we, do we-- Trying to think how we've got a bit of news to cover. Do we talk about each one? I already want to talk about, I wanna dig in already on 4.7 'cause there's been mixed responses, right?
[00:03:18] Dara: Or do we just... are we just rattling things off here? That's... let me rattle
[00:03:24] Matthew: off the rest of the... So 5.5 is the only other one I know about from Claude-- from ChatGPT, and then Google's just still chilling with 3.1 as far as I'm aware. But yeah what's your
[00:03:36] Dara: spicy take on Opus 4.7?
[00:03:38] Dara: I I'm not sure I have that much of a spicy take. I haven't used it a huge amount, partly just 'cause I've been terrified. And this is, again, into something else we'll talk about, but I've been just terrified of the limits 'cause I kept hitting them. So I've been a bit of a cheapskate, and I've been trying to use Sonnet for as much as I can.
[00:03:56] Dara: And then even for Opus, I've been using maybe 4.6 [00:04:00] rather than 4.7. So I've only used it a little bit but I know some of the feedback online has been negative. I think a lot of people didn't like it. I don't know if that's died down since, but...'Cause like, you often get that, don't you, when a new model comes out, you just get that immediate backlash.
[00:04:16] Dara: But there were a lot of people saying it wasn't great. We're definitely on the... I think we've talked about this a couple of times, but there's this wave of AI optimism and everything's great, and AI detractors that seems to do the cycle. I think we're at the AI detraction lull at the minute.
[00:04:33] Matthew: I'm sure that will change soon. I've been using it, I've been using it exclusively pretty much. Yeah, I only use 4.7 on anywhere, in CoWork or in Code. And it was a... At the time I was developing... we'll talk about what we've been developing a little bit later. But I was beginning to build that with 4.6, and 4.7 came along, and I couldn't...
[00:04:55] Matthew: Because I was right in the midst of doing a project and [00:05:00] The model got better as I was doing it. It was a marked-- I could tell how much better it was than 4.6, yeah. And I was hitting the same... So for a while it was eating up a lot of a lot of the limits, but then they tweaked it recently, like a couple of weeks ago, they increased the five-hour limit, and since then I've been struggling to hit the five-hour limit.
[00:05:25] Matthew: My week limit is this week is a little bit up there, but yeah this the headroom they've given since is much higher. Yeah, feels hard to hit. I've noticed that too, '
[00:05:34] Dara: cause I was hitting that one like two, three times a day. And now it does seem better. But you've-- Oh, look, this, I'm cl- I'm glad we've had this conversation on the podcast 'cause I'm gonna start using it.
[00:05:45] Dara: I'm gonna start using 4.7 now. Yeah, I was a little reluctant. I think it's just familiarity as well, isn't it? When you-- Because when it first came out, I was just thinking, "Oh," I was in that mindset of at least for things that don't require the higher level, I was just trying to use Sonnet more often, and then I just didn't [00:06:00] really get into the habit of using it.
[00:06:02] Dara: And because you have to switch to it, I'm basically just I've just defaulted to using whichever is on. So I think in Claude Code it's, I think I'm defaulting to Sonnet at the moment, and it's 4.6 for some of my sessions in CoWork. Sorry, Opus 4.6 and then Sonnet in some of the other ones, depending on what I'm doing.
[00:06:20] Dara: But I'll give 4.7 a go because
[00:06:23] Matthew: yeah. Yeah. Interesting, isn't it? That that we're now getting to the point where you can quite ha- you quite comfortably tick along on an existing model. It's not not quite as thirsty for that extra squeeze of the lemon to get that extra bit out of it.
[00:06:37] Matthew: Maybe because they're getting to a point of such power- Yeah ... that those incremental things don't make as much of a difference. I think if I went backwards, I'd probably feel
[00:06:45] Dara: it, but definitely, yeah. If you try and do, if you try and do something on Haiku, it's like you just d- she's "Sorry, I can add two and two for you, but that's about it.
[00:06:54] Dara: That's about... And come up with seven." But yeah. So G- [00:07:00] Gemini, sorry, again, I know we've got a lot to cover, but Gemini, do you think you get the feeling
[00:07:03] Matthew: something's brewing? They've got the developer conference, so they've got Google I/O next week. I can't remember when they no- they tend to announce models.
[00:07:14] Dara: I can't remember if they tend to do it at Next or if they tend to do it at I/O, or if they just tend to j- just drop them. But so 3.1 would've come out end of last year, wouldn't it? Yeah, was it like, was it late November or early December or something? I think around then. I think, yeah. Yeah. It's really inter- I definitely can't go into this in much detail 'cause I could probably talk about this the entire podcast.
[00:07:35] Matthew: But I just read a book called "Infinity Machine," which is all about Demis Hassabis, which is why his name was on the tip of my tongue earlier about Google. Oh, you got it. Nice. But it's so interesting that I always assumed Google was sitting on their... they had all this stuff ready to go and they were sitting on their hands.
[00:07:53] Matthew: But as it turns out, reading the book, they were behind. They-- When OpenAI came out with that stuff that Ili- [00:08:00] Ilya Sutskever, however you say his name, he put a couple of pieces together at OpenAI that they hadn't done, and they were really running and playing catch up for quite a while. So it's not like they had it.
[00:08:10] Matthew: They invented the transformer, but there's a couple other pieces that needed to come together to get the LLMs working the way. But yeah, anyway, interesting read, but separate point entire- Yeah. But yeah. Yeah, I don't know. Maybe. But be interested to see what happens in I/O next week and if they've got some other...
[00:08:26] Matthew: They've just done Next, but it was primarily, and I think we both commented on this, it primarily felt like them bedding into rolling out production and productioni- productionizing AI rather than any big groundbreaking announcements.
[00:08:40] Dara: Very much so 'cause we did, 'cause we were planning...
[00:08:43] Dara: I'm giving all the secrets away today, all the confessions, but we were planning on doing, like we did last year, 'cause this time last year we did a full episode on Next, because a lot of that was really new. It was new stuff. Whereas, and we planned to do it again but it actually happened. You're right.
[00:08:58] Dara: It was like, it wasn't [00:09:00] a-- there were a lot of it. I think it was over 200. Like they always have something like 200, 250 announcements, but they, that includes so many small things or those announcements include saying, "Oh, things we announced before are now general availability." So they're not all like new stuff.
[00:09:13] Dara: So yeah, I had the same impression as you. It wasn't quite as... Last year we were so fired up about it all 'cause it was so many like brand new things, whereas this time it was a lot more about, yeah, that kind of like productionizing everything and being able to roll this stuff out at scale.
[00:09:27] Dara: So interesting, but maybe not e- episode worthy.
[00:09:33] Matthew: No, it d-- yeah, it definitely feels like they, they're very enterprise-focused, very concentrating on... Like the headlines, I guess we could flick through them as we were talking about it, but they renamed Vertex. They love renaming, really.
[00:09:46] Matthew: They absolutely love it. They renamed Vertex to Gemini Enterprise Agent Platform. So that was... That's, I think, just every, all their AI stuff being covered under there and now being called Gemini. The theory, I think we've talked about this before, that eventually Google [00:10:00] or Google just changes their name to Gemini.
[00:10:02] Matthew: It's gonna happen. It's gonna happen. They released some new TPUs, which is, typical cycle, do it every year. I don't really get the numbers. They're l- they're bigger than the last numbers that they had. This agentic data cloud, so it was... Th- this is interesting. So it's like the sort of lake house that can look at all of these other cloud platforms, and you can have the knowledge center and data catalog of what the information that exists in all of these different sources is.
[00:10:31] Matthew: And y-you've got that sort of agentic layer that can understand data that's sitting in Azure or data that's sitting in over here. So very much a warehouse way of understanding downstream data sources, which we'll talk about a bit later. Nudge, nudge, wink, wink. And then, yeah, they, I think they did a lot of, essentially a lot of statistics about how their customers were moving to increase [00:11:00] tokens, production uses how they were beginning to see savings and unlocks, et cetera from that yeah, we may have missed something obvious that people shout at us about.
[00:11:11] Matthew: What about this amazing thing they announced? But yes yes, that c- is, that could happen. Everything is possible. Unlikely, I suspect, but yeah And you said there were a couple of events. There was another one, I think. Was there an Android event, did you say? Yeah, from Google. From Google, they did a, they did an Android show last week.
[00:11:34] Matthew: It primarily, it's it's not massive. It's related in that they talked about AI a lot, but it was mainly like they're releasing something called Gemini Intelligence, which is their layer between Android and everything else, all the apps and all the rest of it. So it's meant to be more integrated.
[00:11:52] Matthew: They're basically creating a new operating system. So they're ca- they're creating... So the, I think this is the [00:12:00] successor to the Chromebook, but they're creating Google Book. I should, I don't know if I should bother now. Surprised it's not Gemini Book, but they're creating Google Book, which is it contains it's Android.
[00:12:11] Matthew: It's underneath, and then they've got this sort of building on top of that, so there's that going on. And then just some other basic sort of software. But the headline is they're integrating AI into things, which isn't that surprising really. Wow, what a revelation. Yeah. So I'm not a Chro- I'm not a, what's it called?
[00:12:31] Matthew: Chromebook user, and I probably won't be a Google Book user either.
[00:12:36] Dara: No, I have been. It
[00:12:37] Matthew: had
[00:12:37] Dara: an event. Yeah, I have been for client work. Had a Chromebook before, but didn't massively enjoy the experience. It's the same as you. I think what is interesting about the, these events is they, especially like Next is how much ground, it's not quite the right way to put it.
[00:12:53] Dara: It's like cloud the cloud side of Google is growing and growing. It's, in it's not that long [00:13:00] ago that it was a very small piece of their revenue, and I could be wildly off. I have a figure of is it 18% now or something like that? 20%? Maybe it's even higher.
[00:13:11] Dara: It's go- it's going mad. I'm sure, I'm pretty
[00:13:12] Matthew: sure the revenue for, cloud revenue increased something like 48% year on year or something, this is the salt in the salty news, by the way. This is- Yeah, I was trying to, I was trying to desperately just search for
[00:13:23] Dara: a chart. Just throw out numbers. But yes, anyway I just think that's interesting.
[00:13:27] Dara: It's like that that, and how that... I'm interested to see how that evolves given that still even though cloud is growing so much, that they're still so reliant on ad revenue and how that continues to change as time goes on. Yeah. It'll be interesting to see if they put
[00:13:41] Matthew: in...
[00:13:41] Matthew: I don't know actually, 'cause OpenAI have roll- have rolled out further and further their ads experiments. So they started in North America, I think, and they're rolling it out further, which is ads within the chat window when you're talking to ChatGPT on like free tier and maybe the basic paid tier.
[00:13:59] Matthew: Yeah. [00:14:00] And it feels like a natural thing for Google to do, but I don't know if they would do it in like the Gemini app 'cause the Gemini app feels very company-focused and business-focused rather than more general. ChatGPT is just ev- everyone's got ChatGPT, haven't they? Like the Joe Blows on the street knows ChatGPT but has no idea Gemini or Claude exists, and that they still have a large share there.
[00:14:23] Dara: Why would they,
[00:14:24] Matthew: Yeah. So yeah we'll see. But the cloud business is, yeah, is going crazy, and you see all these Gartner Squares that they keep sharing where they're the most innovative and top people in AI and platform provision for AI and all of those things.
[00:14:40] Dara: Okay. So that's the events covered, I think.
[00:14:43] Dara: Maybe we've missed some, but they're the main ones. So where do we wanna go next on our little news roundabout? Meander. Meander. Adventure. News stumble. Stumble. Salty stumble.
[00:14:58] Matthew: It wa- So there was one other event, [00:15:00] but it I think that it can just be part and parcel of Claude our darling Claude, 'cause there's been a fair few updates there.
[00:15:07] Matthew: We've already talked about 4.7. One big thing that we've... that will seem like ancient news to people, but we haven't talked about on here is Mythos. So for those who don't know, essentially Anthropic trained a model, another model but decided it was too powerful for mere mortals such as us. And they released like a load of experiments they've been doing with it around searching for vulnerabilities in existing code bases, and it was finding vulnerabilities in 27-year-old code bases.
[00:15:41] Matthew: There was an example where it had found five vulnerabilities in one code base, then figured out how to chain these together to create a sixth vulnerability that was more powerful and allowed access to this, that, and the other. So their reaction was to create like a consortium of people who [00:16:00] can access and use Mythos to shore up infrastructure and code bases ahead of this kind of powered LLM being available generally on the market.
[00:16:14] Matthew: So I think included in that were people like... They called this Project Glasswing this sort of collection of people, and that's I don't know the exact people now. I can't remember. Let me just pull it up. I don't know, it was like Apple so yeah, the people on it were like AWS, Apple, Google, CrowdStrike, which you probably remember from, was that a year or so ago where that went down and crippled half the world?
[00:16:38] Matthew: Was that CrowdStrike? Pretty sure it was.
[00:16:40] Dara: I thought it was Cloudflare, no?
[00:16:42] Matthew: No, it was Crowd-- Some- somebody pushed some accidental vulnerability in CrowdStrike and it crashed Windows. Like- Oh, yeah. Yeah. Palo Palo Alto, NVIDIA, so there's quite a few people who signed up to this. The obvious missing name in there is OpenAI.
[00:16:59] Matthew: And they've recent- [00:17:00] they, like I think I saw yesterday or the day before, they seem to be making their own play in this space and creating their own sort of security vulnerability project.
[00:17:10] Dara: Yeah. OpenAI. We'll just start off. Yeah. We need to do our full episode on them, don't we? Yeah, we need to switch.
[00:17:18] Dara: It all looks very powerful, but, Yeah. So is there any theories on why they didn't-- is that why they didn't take part in that, 'cause they just wanted to do their own thing, or are they just not wanting to play with others? I don't know, to be honest. I don't know.
[00:17:33] Matthew: I think, yeah, I think there, there's just rivalry there.
[00:17:35] Matthew: I don't know if they... I don't know how much they like. They'll, they'll-- They're obviously a big rival, but also, they're all ex, they're all ex OpenAIers, aren't they? Yeah ... Dario and the and the like, so I don't know if-
[00:17:47] Dara: Yeah ...
[00:17:48] Matthew: it's there, there's no... and Google own a half-decent chunk of Anthropic as well, so it's no surprise to see their name in there.
[00:17:55] Matthew: They're spreading their bets a little bit. I'm
[00:17:57] Dara: wondering with, like [00:18:00] All this stuff around Mythos, builds such intrigue, doesn't it? Like how, what... if it can do that from a security vulnerability perspective, what can it do in terms of, productivity? What, how, what will it be like as a model for our end users compared to Opus?
[00:18:17] Dara: You have to think it's gonna be just a different ballgame entirely.
[00:18:20] Matthew: Yeah. There's some... It, that is difficult to say, isn't it? 'Cause y- all of the metrics from the benchmarks that they all use it's hugely a-ahead of a lot of the other ones, like the software engineering benchmark.
[00:18:32] Matthew: It's up in the 90s at this point. And but there's people also saying, "Oh, yeah how did they run the test? Did they just put a load of compute at it, allow it to indefinitely run over and over again? D- how did they get to find these vulnerabilities? Was it as simple as find the vulnerabilities, like I found five, or were they actually...
[00:18:51] Matthew: Was the equivalent spend on like the Anthropic API that a normal human would use just out of the realms of possibility?" So there's that [00:19:00] kind of bubbling underneath the surface of but I-I've seen a few people who are very level-headed about this stuff and talk and typically call out these kinds of things as marketing plays, th- saying that it isn't.
[00:19:17] Matthew: Because essentially they've come back with some proof in the pudding, with actual vulner- provable vulnerabilities that have been discovered that no-nobody else and no other AI system has been able to find for 30 years. So there's some evidence there, isn't there that's perhaps not been...
[00:19:31] Matthew: it's a different beast, yeah. So yeah, it's interesting, and they keep... it's strange now that you see on every metric and everything they put up on a board now, you see they, they put Mythos in there as well, even though we can't actually- Yeah, no
[00:19:43] Dara: one's got their hands on it as yet.
[00:19:44] Matthew: No. And there's... I've some people are annoyed they don't have access because the argument I've heard is, so all of these rich, large companies get to shore up their code bases, but our thing, and that is probably built on [00:20:00] everyone else's code bases. Like the... And everyone, there's more people than ever vibe coding and creating things, so there's more vulnerabilities than ever being created.
[00:20:08] Matthew: So by gating it just to the rich and powerful companies, you're stopping people from being able to shore up their own code, and like smaller businesses might be extremely vulnerable at the point at which they do release this. By not allowing them to access it now.
[00:20:23] Dara: It's tricky, isn't it? 'Cause you do-- I, like I agree with you, obviously.
[00:20:27] Dara: I think any decent person would think that way, where it's just, it's a kind of inequality. But maybe they're thinking, like they, they also have to be careful, aren't they? That the wider that net is, the more risk they're actually gonna expose it to, that the wrong people will get their hands on it.
[00:20:43] Dara: So if you kick it out- Yeah, you'd love to know, like I'd love to have been a fly on the wall when they were discussing that, because it's not an easy decision, is it? If if you push it out to anyone who asks for it, then it's more likely to get into the hands of bad actors. So maybe they're [00:21:00] thinking, I can't be in on just, not really just finance, trying to understand it.
[00:21:03] Dara: Maybe they're thinking those bigger players will have a lot more of the kind of foundational technology on the web." But if they can... it's like banking and things like that, isn't it? It's like they, they're both-- It's understandable they might prioritize certain, yeah, certain industries.
[00:21:19] Dara: But at the same time, you're right. Like that means then that if they suddenly go, "Great, we've done what we can with it, now it's released to the wild." Those hackers are gonna just jump on every single small or medium-sized website or app platform and try and look for those vulnerabilities, and it's gonna be, it's gonna be a sprint to see who can get there first.
[00:21:39] Dara: So yes, there's hope. They do specifically call out critical infrastructure,
[00:21:43] Matthew: which I, like I take to mean things like power grids and hospital records and- Transport and- Transportation. Yeah. Yeah, like literal infrastructure as their main focus right now. So we'll see.
[00:21:58] Matthew: Maybe there's a phase two [00:22:00] where they... g- Gemini sorry, not Gemini. GitHub has, have released a few things in this area. So like I noticed the other day, you can scan code bases for vulnerabilities. So I did it with a couple of things that have been built, and it produced a couple of interesting vulnerabilities that I closed.
[00:22:16] Matthew: So there's stuff going on in there. Do you think the next thing, logical thing to do would be to get something like Mythos and point it at GitHub- Yeah ... as a whole, and say like just find, just let that loose in the wild and see what it brings. Totally, yeah. Issues. Yeah. I'm sure and just contact them and say, "By the way, you've got a vulnerability that you need to close because this, it's gonna go on the market soon and people are gonna find it."
[00:22:37] Matthew: Yeah. It's fascinating. Really is, yeah. Yeah. What next? Claude, they're-- rounded off Claude, I guess this, there's a, they, at their developer conference, they were, they announced stuff like the SpaceX compute partnership, which triggered them to lift a load of limits for like [00:23:00] their pro users, which of what we talked about earlier, that's based off that.
[00:23:04] Matthew: They are going like, I think I saw, and again, it's all the news, I have no actual facts or figures, but I s- I saw their their current growth in terms of like users and revenue is if it carried on at this pace, they would be like one of the biggest companies ever or something. The rate at which they are growing is pretty unbelievable.
[00:23:24] Matthew: So it seems like at the minute they're out just trying to get a hold of compute from making all these deals with Google, AWS, Space, bloody SpaceX for other compute resource to try and under- underpin this new surge in users they're getting I listened to
[00:23:43] Dara: I listened to a podcast with their, I think it's like their chief growth officer or growth something or other, like a senior, yeah, whoever's leading their-- But it's it was funny because it's like you wouldn't think they would need, you almost wouldn't think they'd need somebody like that at Anthropic.
[00:23:57] Dara: And the guy pretty much said that on the... He was yeah, like a lot of the [00:24:00] growth just happens. Like we, we just have to be there to try and shape it and, nudge it along." But he was saying that they just think in exponentials. They don't think like normal companies.
[00:24:09] Dara: So when they think of growth, they don't look at a chart that's just going up linearly. Everything is exponential. So- You know, let's try and get 2% this
[00:24:16] Matthew: year.
[00:24:17] Dara: Yeah, it's just like it's just not the way they work in every single thing they're doing. And it was quite interesting actually, 'cause he was like, he was saying that shapes everything they do.
[00:24:25] Dara: So they don't even mind spending money or time entirely on things, even if they don't work out, because they're thinking, we need to just be thinking big picture stuff, and everything's about this exponential. So anything they can do to find these ways to have these, huge growth.
[00:24:40] Dara: But when does that end? Like that, like it's not, surely it can't keep going up like that forever. But this huge ceiling left, there's still huge headroom because the, given the fact just even that OpenAI have, lion's share of the overall users, but it's things like Claude Code, isn't it?
[00:24:57] Dara: It's and team, teams now, and [00:25:00] so if it keeps going up and up, but at some point it'll have to- The small
[00:25:03] Matthew: businesses with
[00:25:04] Dara: small businesses,
[00:25:05] Matthew: they're just really-- that was like a day ago, which is their play to help automate the smaller businesses of like payroll and bus- stuff like that, but like connected into the apps.
[00:25:14] Matthew: They're like they're not like, they're not shy of spreading a wide net.
[00:25:19] Dara: Definitely not, but yeah, really mad. It's mad watching it. And I can only imagine what it's like on the inside. It must be amazing to be on the inside experiencing that Yeah.
[00:25:29] Matthew: I imagine that Mythos is powering a lot of their development at this point.
[00:25:34] Matthew: So they've got access to- Yeah. That's outrageous. Yeah. It's pretty mind-boggling what you can do with the tools that currently exist from our side. So God knows what it's like to have a, essentially a genius level coder at this point just on tap and-
[00:25:51] Dara: exactly, and no limits.
[00:25:53] Dara: Like another another podcast with with Boris Cherny, is it? Boris Cherny. I'm not, probably mispronouncing his surname, but [00:26:00] the, the guy who made Claude Code, and he was just talking about like the equivalent of hundreds of thousands of dollars a month in, to some of the developers that's the amount of tokens they're using.
[00:26:13] Dara: So to have that at your fingertips and not have to worry about paying for it is, having both the power of the tool, but also the unlimited usage of it, it's ... my little mind wouldn't be able to come up with enough uses. I wouldn't be able to handle running that amount of tokens.
[00:26:27] Dara: But, for these people who are running like, huge numbers of parallel sessions and not having to worry about tokens or anything else, it's, yeah, it would be pretty amazing, wouldn't
[00:26:37] Matthew: it? Yeah. When I did... I recently got into using auto mode at night. I'll say, I'll set it on a task that's, that can be looped.
[00:26:46] Matthew: So it might be like, "Look at this section of my code base," or you can loop through and look for vulnerabilities of it like, like this or that, or find bugs or whatever it may be. Just some way of working through things, checking, building, and pushing to a branch that I [00:27:00] can then come and review in the morning.
[00:27:02] Matthew: I would do that continuously if I had the headroom, but I, like I ate- Constantly ... my week's budget with doing it a couple of days. So I came in, I was like, it was Tuesday. It reset on Monday, and I had like used 75%, so I had to pull
[00:27:15] Dara: back a bit. So yeah, it's always in the back of your mind, isn't it?
[00:27:19] Dara: You're thinking, "Oh, I could do..." There's two, for me, there's two things. It's like you, it's not like you're fully free of the process, so you need to keep the context in your head. So that's one thing that stops me doing more parallel stuff. But yeah, the main reason is that you're thinking, "Oh I'm getting a bit close," and you're mentally trying to budget your quota, and you're thinking if I run too many things concurrently, I'm just gonna have to go outside and wait till Monday."
[00:27:41] Matthew: Yeah. Touch grass. Yeah. Yeah. But yeah, there's the little bits they've released as well, which is just little functional changes in Claude Code or little features, a lot of which sound like stuff that's been out in the open source community that they've [00:28:00] internalized.
[00:28:00] Matthew: So there's this loop that'll try and aim for a goal and keep moving around until it reaches it, which sounds very similar to Ralph Wiggum. There's something we were building a little while ago, where we were messing with our own sort of service data and stuff at Measurelab, where we came up with the concept of a dream cycle.
[00:28:19] Matthew: So at night, it would look through its memory, sanitize, grab new information, reorder, re-vectorize things. The idea is a human goes to sleep, and that's what their brain is doing. It... Claude Instants now releases something very similar to that, where it can commit and clean and adapt its memory systems based on the previous day's chat. So little bits of stuff like that seem to be appearing in open source, and then they're going, "Yeah, I'll have that," and adding it. But
[00:28:49] Dara: yeah. Yeah, a lot of OpenClaw stuff as well, and all these new, Hermes and all these other kind of harnesses.
[00:28:55] Dara: I think they've had to respond to that and start magpieing bits [00:29:00] up. Maybe that's slightly unfair. I would imagine they have enough smart people that they will have been thinking about these things anyway. But it's maybe sped up their rollout. We talked about that before, didn't we?
[00:29:10] Dara: Where they've maybe become slightly less cautious about some of the rollouts because they've had to respond to things like OpenClaw.
[00:29:17] Matthew: Yeah, some of the stuff that-- Some of the stuff like, like Ralph Wiggum and... I feel like that was actually an Anthropic or ex-Anthropic person who created that in the first instance, in my opinion.
[00:29:26] Dara: I, I think it w- I think it was. I don't know if they're still there. I'm pretty sure it was. I don't know if it was Boris or not, or someone else, but I think it was one of the senior-
[00:29:34] Matthew: Yeah ...
[00:29:35] Dara: developers, I think. Yeah.
[00:29:36] Matthew: Was
[00:29:37] Dara: using that, yeah.
[00:29:39] Matthew: They're just working out in open, which is nice,
[00:29:41] Dara: yeah.
[00:29:41] Dara: There's also Claude Design. I think, yeah, it's been since we last did a news update, I think, which we both use. I think you've maybe used it a bit more than me. I have u- I've used it, and I have also used... Did we mention Stitch? I don't know if we did at all. That's quite new as well, isn't it?
[00:29:57] Dara: Google Stitch, which is like Google's- yeah. [00:30:00] That's not a- it's not as new, but I feel like it's still relatively new. I don't know if we talked about it. I don't know. These are the kind of, they're touted, so the clickbait headlines are like, "It's gonna kill Figma and Canva and all these things."
[00:30:12] Dara: Whether they do or they don't, who knows? But it's an AI-powered design tool that can then hand off to Claude Code or whatever. And I like both of them. I used Stitch first because I used it before Claude Design came out. Claude Design, I've read, is a bit slower.
[00:30:30] Dara: I didn't find it that much slower. It did seem a little slower, but then what I was doing was different to what I'd done in Stitch. But they both seem pretty good to me, and they can be useful just to get projects started and then hand that over to Claude Code, and it's got the design system to work with and kind of reference.
[00:30:48] Dara: You do end up with a better output, a more visually designed output. We probably don't have any designers listening, but they'll be cringing 'cause they probably think it's, you know, a [00:31:00] terrible thing to do this, but for non-designers, it's quite a nice way to get your kind of visual assets into Claude Code.
[00:31:07] Matthew: Yeah, I found... I used Stitch first as well,
[00:31:09] Matthew: and then I found Claude Design really good. Especially when I got into a workflow where I'd built out this site and then I was essentially going through and saying, "Let's review this." You build these design systems that have a language of, yeah, just essentially all the aspects of things that need to be adhered to with their design, and then just letting it go through and suggest refreshes of pages and things like that.
[00:31:32] Matthew: And then once you've got that, and the stuff it was producing was so-- it didn't look like it was AI. Do you know, there's that look- yeah ... of vibe-coded sites. It just did, it looked really nice. And then you can just click Share with Claude Code and just pick it straight up in the Claude Code session.
[00:31:46] Matthew: So I tried it with web design and just some little artifact, little sort of single page HTML, little applications there just for sharing information internally, 'cause it's so easy to do that now. You don't have [00:32:00] to just write a doc, you can just make a little webpage. And I also tried it w- we've tried it with slides.
[00:32:06] Matthew: It looks pretty good with slides as well, so I'm pretty amazed with that. And then I started to try and get it to generate me some Dungeons & Dragons maps, and it was crap. So I found the limit there.
[00:32:16] Dara: You found the weakness, yeah.
[00:32:17] Matthew: Yeah. 'Cause it's not generating... It doesn't generate images. It puts together code and maybe generates some CSVs.
[00:32:24] Matthew: SV- SVG, sorry. But it's not generating images like a Nano Banana, so it couldn't really do anything compelling there. But-
[00:32:34] Dara: Which I think, or maybe I'm wrong now, I feel like Stitch does use- Yeah ... Nano Banana in that. Yeah. So you can get slightly better. But it isn't...
[00:32:43] Dara: Yeah, I think even still, it's not really what they're meant for. They are meant more for visual mock-ups for building an app or a website or a set of slides, whatever, rather than full-on graphic design. Until the [00:33:00] next update, when they say, "Oh, yeah, we do all that now as well."
[00:33:02] Matthew: Yeah, the canvas is very... Not canvas. The design is very young, isn't it? So God knows. It's, I think it's probably still technically in beta.
[00:33:12] Dara: Yeah I would imagine so.
[00:33:14] Matthew: I would imagine probably on the site it's next to the Research preview even, not even a beta. So yeah, a lot of legs in that, 100%
[00:33:27] Dara: Any more news?
[00:33:31] Matthew: Yeah. Loads. Like news from inside Measurelab.
[00:33:37] Dara: Ah, okay.
[00:33:38] Matthew: Segue.
[00:33:38] Dara: Did I just do an accidental segue?
[00:33:40] Matthew: Yeah. I forced it. It was a bit forced, but there we go. Yeah, we've been building, which is fun, and we've built something, and we've been talking about it publicly.
[00:33:53] Matthew: We've got a site and we've been talking about it on LinkedIn and various other channels, but we've not really talked about it on here, which [00:34:00] feels odd at this point. Which means it's not actually real until we talk about it here, so here we go. This is the official announcement of
[00:34:08] Dara: it.
[00:34:09] Matthew: Yeah. So yeah, I don't know where to start with it really, so maybe from what we noticed and why we decided to build something for it.
[00:34:17] Dara: Little bit of backstory. What problem did we recognize and what were we trying to solve, and then what is it?
[00:34:23] Matthew: Yeah. Okay, I'll do that.
[00:34:26] Matthew: Let me state the problem. So obviously Measurelab has been in data forever, since it's existed. And we've always worked on the implementation side of things, building out GTM and all of your tracking stuff, and then we've been working in Google Cloud and centralizing data, and we found that not everyone has the resource, the budget, or the inclination to centralize data.
[00:34:55] Matthew: So there's a gap there in the fact that they have all of these disparate data [00:35:00] systems but they can't join them up in a meaningful way. We've always been trying to bring people along because we know the benefit you can get when you're able to start to join these different data sources together, so we've always been looking for solutions there, and trying to advise people to pull data and centralize data where possible.
[00:35:15] Matthew: And then we also had the realization that more and more people are getting access to LLMs, and more and more people are hooking those LLMs and these teams' licenses up to these, th- these downstream data sources via MCPs. Governance be damned. They are hooking it up because it's quicker for them to quickly get an answer from GA4 or quickly get an answer from Google Ads via the MCP than it is to submit a request and maybe build out a dashboard and pull these data sets together and model the data and have it all within that governed place.
[00:35:48] Matthew: So essentially the path of least friction is ultimately where people will go most of the time. So then you have, like, when you tie all of your governance and all of your semantics and all the rest of it [00:36:00] to the warehouse directly, you're governing a sliver, and you've got all this other stuff over here that is happening, but it's like the new shadow IT.
[00:36:08] Matthew: It's happening and you don't quite know about it, but different answers are appearing throughout your organization and, you mentioned the word before, exponential. It's happening at a quicker rate, it's happening more often, and it's ultimately going to come to a head in bad ways, where people have just got 20 different answers to the same question.
[00:36:29] Matthew: And then we also noticed one of the fun phenomena: people getting these Claude Teams or GPT or Gemini, and essentially being locked from using it because of that governance fear. So larger companies may be saying, "Yeah, here's Claude Teams. We're not allowed to hook it up to any downstream data sources," which makes it essentially a very expensive email drafter and chatbot.
[00:36:55] Matthew: And you're leaving so much potential on the table. So [00:37:00] yeah, those are the kind of things we've been seeing and realizing as we've been exploring AI, data, tracking all of this sort of stuff over the past... we've been playing with AI for a good three years or so now and building things internally on that.
[00:37:13] Matthew: So it's an amalgamation of all those things. So we built something to help solve that, which is SEAM. Now, SEAM stands for Semantic Engine for Agent Mediation, which you can see why we shortened it there. But essentially it's not just semantics, so the name is a little bit misleading there, but we like SEAM, so I don't think we're gonna change it at this point.
[00:37:43] Matthew: But it brings in a multitude of different things. It brings in intelligence modeling, saying where data lives and how it's defined and what the canonical sources of those metrics and resources are. It layers in [00:38:00] governance in that you can say what does and doesn't need to be interfered with or have context injected into it to make sure it's returning the right answers.
[00:38:09] Matthew: For example, you can say, "This source needs to be governed," and it won't let a user just randomly ask questions until it's got context passed back to the LLM to then ask the right questions. And it's got a big audit trail of everything that's happening: what's governed, what's not governed, where the gaps are. It is like a cycle of improvement of things through SEAM. So that essentially is a middle layer between any LLM and any downstream data source, disconnected from a warehouse. You can essentially leave your data sitting where it currently sits. Could be a warehouse, but it could also be in the disparate systems, and you can unlock AI in a number of different meaningful ways.
[00:38:54] Matthew: I wanna stop talking now 'cause I feel like I've been talking for a long time.
[00:38:58] Dara: And that's it. Thanks for joining [00:39:00] us today. No, that's it. And it's hard to know when you're biased, when you've been involved in it. So I was listening to that thinking, would that make sense to someone hearing it for the first time?
[00:39:11] Dara: And I think it would, but I don't know if I'm biased. I'm at risk of just repeating what you said in different words, but it's recognizing that people are gonna be doing this anyway, and it's putting that layer in between their LLM of choice and all the different connectors, all the different data sources that they might be connecting to.
[00:39:33] Dara: And it's creating a layer that's ensuring that they're gonna get back validated answers. So if somebody in marketing and somebody in finance are querying two different systems, there's an intelligence model that's saying, "The marketing person's asking for this metric, and it should come from here.
[00:39:51] Dara: Finance person's asking for this," and it has that layer. And it's all audited, so there's a full log of all of that. And anybody who's using it can see [00:40:00] where the information is coming from and know that it's been validated. So it means that people can follow this path of least friction, least resistance, but they can do it in a safe way.
[00:40:12] Dara: So it ticks both boxes. It's not trying to force everything to be governed centrally. It's allowing people to use LLMs to do these things, but it's ensuring they're doing it in a way that's consistent and safe.
[00:40:24] Matthew: Yeah. Yeah, exactly. And in doing so, like we went to solve those initial problems that we'd-- we've essentially been building this for ourselves and realized in doing so "Oh, wait a minute, there's something here.
[00:40:35] Matthew: This could be really useful for our clients or just for anybody who, yeah has this problem," which is on the exponential curve of more and more people having this problem. So we started to realize let's open this up to people. Let's let other people use it.
[00:40:50] Matthew: So to that end, it's built on top of a CLI and a validation engine that we've created. So that's available on npm for any of the [00:41:00] developer, more technically-minded folks out there. You can go and download that, and that'll help you build out your intelligence model.
[00:41:07] Matthew: The intelligence model is all YAML definitions with different layers in it. And so you can go away, you can tinker with that and build it out. You can think of it a little bit like dbt or Dataform in terms of these structured ways of identifying things and linking them together.
[00:41:22] Matthew: So that's out there. But in building it, we started to discover other benefits that we didn't originally think about but which seem obvious in retrospect. One of which is how quickly the LLM gets to the correct answer using it.
[00:41:42] Matthew: So it's obvious if you think about it, but if you just ask a question of a data source raw to Claude, what it'll do is it'll go to the MCP, and then maybe it'll start searching through things. So unless you've been extremely explicit in [00:42:00] your question and given it every ID, every dataset, the definition of what X and Y are, and how you should combine them...
[00:42:07] Matthew: Unless you've given it all of that information, it has to figure things out and go on an exploration and find things. So we found that on defined metrics and things, the number of tools it used to get to an answer was dramatically reduced.
[00:42:28] Matthew: Like a 30 to 40% reduction in the number of tools used, a 40% reduction in how long it took to get to that answer. And it was returning the correct answer in the high 90s percent of the time, versus less than 50% of the time with the raw MCP calls. There's all these additional benefits.
[00:42:48] Matthew: Context and tokens reduced 'cause it didn't have to spend tons of context and tokens figuring out what the question is and where the information lies. So yeah, there's these [00:43:00] other things that we're discovering as we go, like, "Ah, that's another benefit of it, and that's another addition."
[00:43:04] Matthew: So it's hard to, it's hard to not go off in 50 directions when you're describing it as to what it does.
[00:43:10] Dara: It is, and a couple of things that you said already, but just to go back over one, which was you said it does work with a warehouse. I think it's important to stress that because what we're not saying is, "Oh, don't worry about having a central warehouse, just use this instead."
[00:43:25] Dara: Because there are reasons why you might want to centralize data. And people may already have a data warehouse. So this isn't saying scrap that and use SEAM instead. It's saying you can use SEAM in addition. So if someone already does have a warehouse, that could just be a source, just like going direct to GA4 or HubSpot or wherever.
[00:43:45] Dara: So if someone's got a warehouse, that could just be one of the connectors in SEAM.
[00:43:50] Matthew: Yeah. Yeah, 100%. The large aggregation, machine learning, all the benefits of the warehouse still exist. Yeah. It's just that [00:44:00] most information and retrieval of data now doesn't actually go via it in a lot of cases.
[00:44:08] Matthew: Teams have just got their little shadow access to MCPs and accounts in that way.
[00:44:14] Dara: And the other point, which is actually related, that I was gonna come back to is you mentioned dbt. So with SEAM, firstly, it can work with a data warehouse, but it can also work with sources directly that aren't feeding into a data warehouse.
[00:44:29] Dara: But it's also not just structured data. Unstructured data sources can be governed through SEAM as well, which is another huge advantage. So you can have things like Slack, you can have things like access to Drive, things like that where you've got maybe files that aren't structured.
[00:44:49] Dara: So really anything that has an API you could govern and connect to through SEAM.
[00:44:56] Matthew: Yeah, and there's some concepts in it [00:45:00] that I won't go into, but we'll share measurelab.ai first and foremost. There's a load of information about SEAM there. We've also done some internal benchmarking and testing, and that's where I was pulling those stats from before that I was spouting.
[00:45:11] Matthew: We've got a study PDF that we can share in the show notes as well. There's some concepts within it that we don't need to go into depth with. You can find that information online. But there's stuff like entities, which is like a team or company-wide definition of something, be it customer: what does customer mean to you as an organization?
[00:45:31] Matthew: Or what does conversion mean to you as an organization? Or what does conversion mean to you as a team? Are there different definitions of conversion around the company? That can all be baked into the intelligence model and allow the LLM to know where to go. But the reason I'm mentioning that is one example we found really powerful internally is defining a client and being able to point to not just, say, their data sources or whatever else we may have, but also unstructured information about project [00:46:00] plans, SOWs, whatever.
[00:46:02] Matthew: All of these disparate pieces of information that are dotted around in Google Drive, in this source system over here, in a Slack thread over there. We're able to point at them all, say what they are, just pull them all into an intelligence model, and the LLM can quickly go, "Oh, you're asking about client B?
[00:46:23] Matthew: Yeah, absolutely. No this, and this. This is over here. That's over there." And it can answer questions so much more richly and so much quicker.
[00:46:31] Dara: Yeah, so yeah, unstructured is a really good point. I'm just, I'm trying to zoom out a little bit and just think about the different audience, the, different people who might be listening to this podcast.
[00:46:41] Dara: So SEAM's useful at company level, but for the use cases, the people using it are potentially gonna be less technical people. I'm being very simplistic here, but they might be less technical people, and they don't need to be technical to use SEAM. But then there's whoever's gonna [00:47:00] manage SEAM internally for a company.
[00:47:03] Dara: So we're talking about the intelligence model and YAML files and structured and unstructured data and data sources connections. So there's probably two... It might be a good point for us to split and be a bit clearer about how is SEAM used, so more on the admin side, and then what are the benefits to the end users of it and what would their experience be of it.
[00:47:25] Dara: Just assuming we've got both types of people listening, making sure that we're being clear, because I think this is the thing with SEAM, isn't it? It's like there is very much a technical managing the intelligence model side, but the end users don't actually need to know about any of that.
[00:47:39] Dara: They just need to know they can reliably ask a question through Claude or through ChatGPT or whatever, and they're gonna get a reliable answer back.
[00:47:47] Matthew: Yeah, that's one-- I suppose one clarifying point then is like, so an end user essentially connects to the SEAM MCP and SEAM acts as the sort of conduit to all of [00:48:00] your other MCPs, all of your downstream data sources.
[00:48:02] Matthew: So you can imagine within Claude, you could turn everything else off and just have SEAM turned on, and SEAM will surface all of the tools and everything that are in all of the MCPs that you have connected to it. So it acts as a proxy to any call to downstream data, which is how it makes sure the governance is enforced.
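To make the proxy idea concrete, here is a minimal Python sketch of a middle layer that surfaces downstream tools, hands context back for governed sources before running anything, and keeps an audit trail. It is only an illustration of the pattern described in the conversation, not SEAM's actual implementation: the `GovernedProxy` class, its method names, and the `context_acknowledged` flag are all hypothetical, and the downstream tools are stand-in functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GovernedProxy:
    """Hypothetical middle layer: every downstream tool call passes through here."""

    tools: Dict[str, Callable[[str], str]]  # tool name -> stand-in downstream call
    governed_context: Dict[str, str]        # tool name -> context required before use
    audit_log: List[dict] = field(default_factory=list)

    def list_tools(self) -> List[str]:
        # Surface every connected downstream tool through the one proxy connection.
        return sorted(self.tools)

    def call(self, tool: str, query: str, context_acknowledged: bool = False) -> dict:
        if tool not in self.tools:
            raise KeyError(f"unknown tool: {tool}")
        required = self.governed_context.get(tool)
        if required is not None and not context_acknowledged:
            # Governed source: return the context to the caller instead of running the query.
            self.audit_log.append({"tool": tool, "query": query, "outcome": "context_returned"})
            return {"status": "needs_context", "context": required}
        result = self.tools[tool](query)
        self.audit_log.append({"tool": tool, "query": query, "outcome": "executed"})
        return {"status": "ok", "result": result}


# Stand-in downstream tools; a real setup would call the actual connectors.
proxy = GovernedProxy(
    tools={
        "ga4": lambda q: f"ga4 result for: {q}",
        "slack": lambda q: f"slack result for: {q}",
    },
    governed_context={"ga4": "Use the agreed definition of sessions; exclude today."},
)

first = proxy.call("ga4", "sessions last week")                              # context handed back
second = proxy.call("ga4", "sessions last week", context_acknowledged=True)  # query runs
```

Because every call goes through `call`, the caller cannot reach a governed source without first receiving its context, which is the difference from an off-to-the-side lookup that the model is free to skip.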
[00:48:20] Matthew: A lot of other systems tend to be an off-to-the-side call: retrieve information, then go to the system. But the LLM doesn't have to use it. It could just bypass it unless you're very explicit and build out skills and flows in that way. So
[00:48:34] Dara: Sorry to stop you there. I'm just gonna abstract that even more. If you're listening and you don't even know MCP: if you're using Claude or if you use ChatGPT, you can connect it to other tools.
[00:48:53] Dara: So you can connect it to things like GA4, you can connect it to Slack, you can connect it to Google Drive, you can connect it to [00:49:00] Google Calendar, Gmail, all these different tools that you might use day-to-day. How it works isn't even really... it works, and you can go in, manually is the wrong word, but you can go in and connect to these tools individually.
[00:49:13] Dara: And what you were saying there, Matthew, is with SEAM, it would do the work of using the tools of all those individual connections, and the governance aspect would make sure it's using them in the right way. But as the end user, all you really have to do is go in and turn something on in, again, Claude or ChatGPT or Gemini.
[00:49:36] Dara: You turn it on and you forget about it. And then the intelligence model, which is the bit I guess we'll get to around how it's managed: if you're an end user, you don't really need to worry about that. You turn this on in Claude or in ChatGPT or in Gemini, whatever, and then you can ask the kind of questions that you might wanna ask for your day-to-day, and that could be anything from, "Look at my [00:50:00] calendar and tell me what I'm doing tomorrow."
[00:50:02] Dara: It could be, "Draft an email to Matthew saying we need to arrange a time to set up the next podcast." It could be, "Look at Slack and tell me what questions people have asked me this week." Or it could be, "Go into my analytics and tell me how many website visitors I had in the last week," and then the list goes on and on.
[00:50:23] Dara: But all those kind of things that you would do in your day-to-day, what this would allow you to do is actually do all of that within Claude or within ChatGPT. I need to stop listing them all out every time. Within, within, within your LLM of choice, you can do all that. Claude is ChatGPT, isn't it?
[00:50:40] Dara: Yeah, including Claude and Ch- in theory, and this is where we are now, as in you and me we're doing most of our work within these tools, but we're still using all the same tools and websites and whatever we would've done before. We're using email, we're using Calendar, we're using Slack, we're using Jira, we...
[00:50:58] Dara: all these different tools. We're [00:51:00] still doing all the same thing, but we're able to do all of that through an LLM because of SEAM. So I just feel like it's probably worth stressing that for anyone who's maybe thinking, "Oh, hang on. Do I need to understand MCPs? Do I need to..."
[00:51:14] Dara: You don't. If you're using SEAM, all you need to know is you've got to make sure this is turned on, and then once it is, it's gonna allow you to do all of these things.
[00:51:24] Matthew: Yeah. Yeah. It's really hard to figure out where to... There there's probably five different people using it at different levels, right?
[00:51:32] Matthew: But yeah, ultimately the point is it's seamless. That is why we called it that, but I like it. It's pretty seamless in terms of you just suddenly get more power out of what you already had with your LLM. It unlocks AI to move away from being a chatbot to actually start retrieving correct information and even doing downstream tasks for you; it's agentic.
[00:51:53] Matthew: So if you're in a position where you're locked in with Copilot and basically you're [00:52:00] uploading PDFs, so you're trying to squeeze more out of that particular lemon, having SEAM just turned on with all these downstream tools and intelligence modeling within it would be like creating fire for the first time.
[00:52:14] Matthew: It'd be pretty transformative. So yeah, from that perspective. And it all works with an individual's user credentials. There's a little SEAM UI that you just have to go into once and just authenticate with any tools you want to use. So if you want to use GA4 in it, you just authenticate with GA4, et cetera, et cetera, et cetera.
[00:52:32] Matthew: I won't be able to list out all the tools like Dara. Like he works for the BBC or something and needs to be impartial.
[00:52:38] Dara: And I realized after saying that I hadn't listed Copilot once.
[00:52:42] Matthew: Yeah. Yeah. Which is probably a big... That is one of the bigger unlocks actually, Copilot.
[00:52:46] Dara: I think so, yeah. Yeah. Sorry, just let me add one more thing to what I said as well. I'm really thinking of the people who are using an LLM, but they're really not technical. [00:53:00] They're people who, if they hear the word MCP, start to back away.
[00:53:03] Dara: Not only do you not need to know about that, but you don't even need to tell it which of those tools or which of those connections to get the answer from. So if you're someone working in any company and you wanna know how many sales we had last week, the common problem is you might have some people who think they need to go to the analytics tool, some people who think they need to go to a sales system, some people who'll go somewhere else.
[00:53:26] Dara: With this intelligence model built in, there's a company-wide decision around where that number lives. So if you're just somebody who's been asked to find out how many sales there were last week, you just want the answer. SEAM will go to the right place. You don't even need to think, "Oh, do I go to Google Analytics or do I go to our sales database, or do I go to the sales team?"
[00:53:47] Dara: You just ask the question, and it will bring back the correct answer.
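As a rough illustration of why a company-wide definition gives everyone the same answer, the Python sketch below pins a metric to one canonical source and pins "last week" to one exact date window. Everything in it is an assumption for illustration: the `INTELLIGENCE_MODEL` dictionary, the source names, the `resolve` helper, and the choice of a Monday-to-Sunday week are hypothetical and are not taken from SEAM.

```python
from datetime import date, timedelta

# Hypothetical intelligence model: one agreed source per metric.
INTELLIGENCE_MODEL = {
    "sales": {"source": "sales_database"},
    "website_visitors": {"source": "ga4"},
}


def resolve(metric: str, today: date) -> dict:
    """Return the canonical source for a metric and the exact window for 'last week'."""
    definition = INTELLIGENCE_MODEL.get(metric)
    if definition is None:
        raise LookupError(f"metric '{metric}' is not modelled")
    # Assumption: 'last week' means the previous full Monday-to-Sunday week,
    # so a partial current day is never included.
    start_of_this_week = today - timedelta(days=today.weekday())
    return {
        "source": definition["source"],
        "start": start_of_this_week - timedelta(days=7),
        "end": start_of_this_week - timedelta(days=1),
    }


# Two people asking on the same day get the same source and the same window.
marketing_view = resolve("sales", date(2025, 1, 15))
finance_view = resolve("sales", date(2025, 1, 15))
```

The point of the sketch is that the person asking never chooses between the analytics tool and the sales database; that choice is made once, in the model.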
[00:53:51] Matthew: Yeah, you just say, "How much did our Instagram follower count increase last week?" That's all you need to say.
[00:53:59] Dara: That's a common question that [00:54:00] I'm asked. I'm asked it-
[00:54:00] Matthew: Yeah. You're an influencer.
[00:54:02] Matthew: I understand that. But it, that's it. And it could be asking that in any of your existing LLMs, and it will do the... As long as that's modeled, it will go off, and it will retrieve the answer, and it will crucially give you the same answer every time you ask it, and it will give the same answer to anybody else around the company that asks it.
[00:54:21] Matthew: To go back to that sort of old thing. Tirade. I'm trying to figure out how to describe my opening SEAM description. It will mean that the same answer comes out across the board, which is the real problem at the minute. We can run through an example of, say, if I asked that same question, "How much did our Instagram followers increase last week?"
[00:54:42] Matthew: And you just had all these sources arbitrarily connected up without any real sort of semantics or intelligence in the middle. It may go and retrieve that information from BigQuery one day. It may go and retrieve that information directly from Instagram. It might go and get it from a Slack message that somebody mentioned it [00:55:00] in.
[00:55:00] Matthew: It will just... Whatever it comes across first. It might include today, it might not include today. There's just a lot of variables that could affect what that answer looks like, and that's the point of that middle layer, of that intelligence layer. All of those variables are thought of and defined so that it keeps giving you the correct thing back again.
[00:55:19] Matthew: That's semantics in a nutshell. We did a webinar in December where we were looking at conversational analytics, which is essentially what this is as well. It's one of the things it is. But just raw questions to data just doesn't work. It will give you the wrong answer.
[00:55:33] Matthew: You need this bit in order to make sure you're getting the right thing back. Which is understandable as well. It's not even a failing of the technology. And I think this happens, doesn't it? People go, "Oh, AI doesn't work very well 'cause I asked it this question without giving it any context, and it gave me the wrong answer."
[00:55:49] Dara: Of course it did, 'cause it doesn't know what you mean, and this takes care of that. It's saying, "Someone asked this question. This is what that question means, and this is where the answer needs to come from," and it dishes up the context behind that. [00:56:00] Yeah, it's only... it's to go back to the...
[00:56:03] Matthew: Not the old world. To go back to implementation. It's only like setting up implementation on a website, isn't it? You could just point a raw Google tag at your site and put GA4 on it, and you would get event data. But 90% of people aren't gonna stop there. They'll start defining specific events and activities and conversion journeys and funnels that occur for them individually as a business.
[00:56:27] Matthew: That's where you get the power out of the tool. So it's just the same concept. It's just because AI can be so convincing in its answer, it can go, "Actually, no, this is definitely right. What are you talking about?" And you go, "Okay." That's the kind of difference. It's not an absence of information.
[00:56:42] Matthew: It will provide the information regardless of whether it knows the true answer or not. So it's more dangerous in that way. Yeah. In a nutshell, there's a couple of ways you can do it. You can manage this whole thing in an IDE, in Visual Studio, and you can install SEAM core and SEAM CLI, and they [00:57:00] have all the information about the schemas and how all these things should hang together.
[00:57:05] Matthew: So you can build out your YAML definitions of connections and resources, entities, metrics, those sorts of different definition types. You can define all those and say, "This calls this, and this is related to this, and this is how you define this metric, and this is the canonical source of that metric."
[00:57:24] Matthew: You define all that in these YAML files, and then you can run SEAM validate, and it'll tell you where there's issues, where there's errors, where there's gaps. So you can validate things in a similar way to how you would with Dataform or with dbt or those kinds of tools that people might be familiar with.
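[Editor's sketch] The actual SEAM schema and validator are not described in the episode, so the following is only an illustration of the kind of cross-reference check being talked about. It uses plain Python dicts standing in for parsed YAML files; every field name (`connection`, `entity`, `canonical_source`) and the example model are assumptions made for this sketch.

```python
# Hypothetical model: plain dicts standing in for parsed YAML definitions.
MODEL = {
    "connections": {"bigquery": {}, "instagram_api": {}},
    "entities": {
        "instagram_account": {"connection": "instagram_api"},
        "invoice": {"connection": "xero"},  # "xero" is not declared above
    },
    "metrics": {
        "instagram_follower_growth": {
            "entity": "instagram_account",
            "canonical_source": "instagram_api",
        },
        "revenue": {"entity": "invoice", "canonical_source": "xero"},
        "churn": {"entity": "subscription"},  # unknown entity, no source
    },
}


def validate(model: dict) -> list[str]:
    """Return human-readable issues; an empty list means no gaps were found."""
    issues = []
    connections = model.get("connections", {})
    entities = model.get("entities", {})

    # Every entity must point at a declared connection.
    for name, entity in sorted(entities.items()):
        conn = entity.get("connection")
        if conn not in connections:
            issues.append(f"entity '{name}': unknown connection '{conn}'")

    # Every metric must point at a declared entity and a declared canonical source.
    for name, metric in sorted(model.get("metrics", {}).items()):
        entity = metric.get("entity")
        if entity not in entities:
            issues.append(f"metric '{name}': unknown entity '{entity}'")
        source = metric.get("canonical_source")
        if source is None:
            issues.append(f"metric '{name}': no canonical source defined")
        elif source not in connections:
            issues.append(f"metric '{name}': unknown canonical source '{source}'")
    return issues


for issue in validate(MODEL):
    print(issue)
```

Run against the example model, this reports the undeclared `xero` connection (once for the entity, once for the metric) and the two gaps on `churn`, which is the "where there's errors, where there's gaps" feedback in miniature.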
[00:57:38] Matthew: So that's perhaps more of the data and analyst route, where you wanna get your hands on the raw definitions and build them out. We ship that with a CLAUDE.md file because Claude is there to help you build out these definitions, so you don't just have to do it manually.
[00:57:56] Matthew: Claude or the LLM can help you [00:58:00] build these things en masse. That's very much the manual, get your hands dirty, get into the model route. That all syncs up to GitHub, and GitHub is ultimately the central place for everything. It's where the information lives.
[00:58:12] Matthew: That's what syncs up with SEAM, and where SEAM grabs its model from is via GitHub. But you can also then do it via the SEAM UI in something called Canvas, which is more of a point-and-click UI way of... Keep knocking my mic. A point-and-click UI way of defining these different things and hooking them together and being able to visualize how they're all hanging together.
[00:58:35] Matthew: So you've got two rough routes in: the more UI-based one, working through it with visual definitions, and the raw underlying information. So we try to cover both routes there.
[00:58:49] Dara: Yeah, and you can use a combination of both. It's not one or the other either. You can use either of them or you can use both of them, because we wanted to, as you said, we wanted to give people...
[00:58:58] Dara: So it would mean that someone [00:59:00] who is comfortable working in an IDE and is familiar with YAML can go in and do all of that editing at source. But equally, somebody who maybe has more of the actual business logic might want to go in and see it in a more visual way and say, "That's not right.
[00:59:14] Dara: That's not actually where that should come from. That metric should come from our finance system. It should come from Xero or whatever." Then they should be able to go in and make that change without needing to understand YAML or know how to use VS Code or whatever. So it's the same thing, but it's two different routes into it.
[00:59:31] Matthew: Yeah, and you've got stuff like this: if you're defining this internally, as we said, a user doesn't have to worry about any of it. They could be asking questions of their LLM happily, unaware of the fact there's an intelligence model in the center.
[00:59:43] Matthew: But if they start asking questions that aren't mapped to the intelligence model, then that will be highlighted in, say, Canvas to show "People are asking these questions of sources that we don't have defined in our intelligence model," and you can then pick that up and pull that in. So without the user ever knowing about it you're mapping [01:00:00] and building out your intelligence model more and more because it's capturing those gaps.
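[Editor's sketch] A rough Python illustration of that feedback loop: questions that match something in the model are routed to a metric, and everything else is counted so a maintainer can see which gaps come up most. The keyword matching here is a deliberately naive stand-in, and all names are hypothetical; the episode does not describe how SEAM actually maps questions.

```python
from collections import Counter
from typing import Optional

# Hypothetical mapping from phrases to modeled metrics.
KNOWN_METRICS = {
    "instagram followers": "instagram_follower_growth",
    "revenue": "revenue",
}


def route(question: str, unmapped: Counter) -> Optional[str]:
    """Return the modeled metric for a question, or record it as a gap."""
    text = " ".join(question.lower().split())  # normalise case and whitespace
    for phrase, metric in KNOWN_METRICS.items():
        if phrase in text:
            return metric
    unmapped[text] += 1  # nobody has modeled this yet
    return None


gaps = Counter()
route("How much did our Instagram followers increase last week?", gaps)
route("What was our churn last month?", gaps)
route("what was our churn last month?", gaps)

# The most frequently asked unmapped questions are candidates to model next.
print(gaps.most_common(3))
```

The user never sees any of this; the counter is what would surface in an admin view as "people are asking about churn and we have not defined it".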
[01:00:04] Matthew: Which is a really nice way of having that feedback loop of an ever-improving model. Yeah, and there's all sorts of other things inside of the UI, on the admin side of things. Obviously, you can see an audit log of all of the different tools that are being used, and what governance is being added to those tool calls to increase their effectiveness and their recall, et cetera.
[01:00:27] Matthew: You can see the percentage of tools and questions that have been governed or not governed, and what tools are erroring across the board. You get a kind of snapshot of the health of your company's AI usage, essentially. And I think the other benefit of this that we're discovering is it's a bit of an accelerant to AI adoption because people...
[01:00:50] Matthew: We've always described like when somebody finds a use case for them, when somebody realizes "Oh, it can do... AI can do this thing. Oh, I can send it off and do this task," as like the light bulb [01:01:00] moment where they go, "Ah, okay, this is bigger than I thought it was." And because we're able to facilitate people just in their day-to-day with all of these suites of tools that answer back reliably and in an intelligent way and can take actions reliably in an intelligent way, the frequency of those light bulbs going off is increased massively.
[01:01:19] Matthew: So if you're in a position where you're like, "I just need people to start getting this and adopting it and using it," which is tricky. It's very hard for some people to grab that concept, understandably. SEAM kind of helps, we've observed, with ticking that along.
[01:01:35] Dara: Yeah, and that audit log, it's granular as well, so you could even see things like, just to try and... 'Cause it's hard with all this on a podcast. We're trying to, like, do it without any visual aids-
[01:01:44] Matthew: It's a real test of our-
[01:01:45] Dara: Exactly. To be able to actually explain the benefits, it'd be interesting when we listen back to this.
[01:01:51] Dara: Did we do a good job of getting it across? But just thinking of the audit log and your point about helping to drive AI adoption, you can also see which [01:02:00] of the tools are being most commonly used. So you get all of this kind of granularity within the audit log.
[01:02:05] Dara: So you could see where people are finding utility, and you can see where they aren't, so you could see the gaps as well and think, "Hang on a minute, we know we've got this connected to, let's just take Slack, but nobody's using that." And maybe people don't know that they can do that, or they don't know why they would do that.
[01:02:21] Dara: So then you could help by saying to people, "Look, we notice that everyone's using this to check their calendar and PostHog or GA4 or whatever, but nobody's using it to do X, Y, or Z." So that audit log lets whoever's managing this look at what tools people are using and what they're not using, to try and spot patterns and see.
[01:02:41] Dara: And then what, yeah, what should-
[01:02:42] Matthew: The skills of it.
[01:02:43] Dara: Exactly, yeah.
[01:02:44] Matthew: Yeah.
[01:02:45] Dara: Without that you wouldn't really know. You can obviously ask people, but you don't always get the full picture back, so you don't know where people are finding use and where they're not finding use, and this just gives you a bit more data to work from.
[01:02:58] Matthew: And on [01:03:00] that audit log there's another interesting aspect, which is that every call that a user makes, and essentially the route that it takes, so it goes and retrieves data from here and brings it back, et cetera, has a unique audit ID. So you can essentially allow...
[01:03:17] Matthew: A user could quote, "This is the answer I got, and this is the sort of answer ID." And then you can say, "I know exactly where that came from. It went via this route. It got this information from here. This is how it ran it." So you have complete accountability and a trace of how it came to that answer.
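[Editor's sketch] The idea of quoting an ID and getting the full route back can be sketched in a few lines of Python. This is an in-memory illustration only, assuming each call gets a random identifier and an ordered list of steps; the class names and step strings are hypothetical, and the real audit log's storage and fields are not described in the episode.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    audit_id: str
    question: str
    steps: list[str] = field(default_factory=list)  # the route the call took, in order


class AuditLog:
    """In-memory stand-in for an audit log keyed by a unique ID per call."""

    def __init__(self) -> None:
        self._records: dict[str, AuditRecord] = {}

    def start(self, question: str) -> AuditRecord:
        # uuid4 gives a random identifier the user can quote back later.
        record = AuditRecord(audit_id=uuid.uuid4().hex, question=question)
        self._records[record.audit_id] = record
        return record

    def lookup(self, audit_id: str) -> AuditRecord:
        return self._records[audit_id]


log = AuditLog()
record = log.start("How much did our Instagram followers increase last week?")
record.steps.append("resolved metric: instagram_follower_growth")
record.steps.append("read from canonical source: instagram_api")

# Later, someone quotes the ID and the whole route can be replayed.
for step in log.lookup(record.audit_id).steps:
    print(step)
```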
[01:03:36] Matthew: And again, with that feedback loop you can improve things as you go along. And what that's also allowed us to do is... You've got all these definitions. You've got your connections, your entities, understanding the different definitions that exist across your organization, your metrics, where you're defining specific combinations and aggregations of things.
[01:03:59] Matthew: It's allowed us to [01:04:00] build dashboarding on top of all of that because we can trace everything all the way through. You know a dashboard has concrete answers in it because it's got all of this audit log of exactly how it's coming to the metrics that sit inside of that dashboard, which is something that is missing in pretty much every dashboard where you just...
[01:04:21] Matthew: People-- How do people trust that information as well? You've got the audit. You've got all this Kairos model. You might as well use it to service that as well.
[01:04:30] Dara: Yeah, and we've noticed, and this wasn't a surprise, that with or without SEAM, if you've got your LLM connected up to all these data sources, you can bypass what a lot of people have previously used dashboards for.
[01:04:43] Dara: But that doesn't mean dashboards don't still have a purpose, and I think for any kind of static information, especially for management reports or anything like that, there's still a need for dashboards. So it made sense also to have this kind of dashboard layer in here, and then that's something else that can be defined and governed within [01:05:00] SEAM as well.
[01:05:00] Dara: Like you said, those metrics are rolling up anyway and there's that audit trail, that validation trail, so why not have the ability to piece those together into a dashboard as well?
[01:05:10] Matthew: Yeah and we don't... we're not centralizing the data there. We're essentially creating sort of ephemeral cached dashboards of that information, so we don't retain the information.
[01:05:21] Matthew: We're not building out a warehouse. We're just able to retrieve and cache and analyze that data in those dashboards in a way that means you don't have to move the data ultimately. Like we said before, there's obvious benefits to centralizing data. When you get to the point where you're like, "Now we need to build our machine learning models.
[01:05:42] Matthew: We need to do some really heavy aggregations here, there, and everywhere," the warehouse is still obviously the place to go. But again, hook that up to SEAM once you've done it. Don't connect your semantics to the warehouse, 'cause you're leaving a lot on the table that, yeah, is important to capture.
[01:05:59] Dara: We've probably [01:06:00] forgotten more than we included, but hopefully that has been a fairly decent introduction to SEAM, why you would use it, what it does, how it works. If people are interested, how do they get in touch with us?
[01:06:15] Matthew: Yeah, so I think we've covered most things there, but it's a big concept.
[01:06:18] Matthew: There's lots of different moving parts, so please reach out to us if you've got any questions about it. What we'll do is in the show notes, we'll put the links to Measurelab AI and any other resources we've got around it. We'll put the study that we created in there as well.
[01:06:36] Matthew: We're beginning to roll this out to customers. So essentially there's a Contact Us form on the SEAM website where you can get in touch, and we can talk about the needs you have. We're also looking for partners who maybe see that there's benefit to their clients, where they could put SEAM in there and help them unlock all these disparate data sources and unlock AI.
[01:06:58] Matthew: So if you're interested in being a [01:07:00] partner with SEAM, also reach out. Again, there's another form on the website where you can sign up for a partner program. And yeah, generally just have a read around it, absorb it and give us a shout, and we'll be happy to start getting you going and set up.
[01:07:14] Matthew: So there's one important thing to say: it's an ongoing sort of license cost, a monthly thing. It's not a big outlay cost to build it. It's more of an ongoing software cost kind of model that we're talking about here. Just in case that wasn't clear 'cause of what Measurelab has historically done. But yeah, it's there, it's ready, and we're beginning to onboard people. So reach out.
[01:07:39] Dara: Yeah. And again, back to the point about it being hard to take it in over a podcast. So check out the website, book in a call, and we can show you things rather than just talking about them on a podcast.
[01:07:51] Matthew: Yes. Yeah, it's much more compelling when you can show some examples of calls with AI and the interfaces and stuff. So happy to do demos if anyone needs [01:08:00] one.
[01:08:01] Dara: All right. Let's wrap it up there. Until next time. That's it for this week's episode of "The Measure Pod." We hope you enjoyed it and picked up something useful along the way.
[01:08:10] Dara: If you haven't already, make sure to subscribe on whatever platform you're listening on so you don't miss future episodes.
[01:08:16] Matthew: And if you're enjoying the show, we'd really appreciate it if you left us a quick review. It really helps more people discover the pod and keeps us motivated to bring back more. So thanks for listening, and we'll catch you next time.