"If the headline of the two events is that agents are the future, the subplot is definitely the governance, the grounding that needs to sit underneath them to make them useful." - Dara
"Tying the semantics and the models to individual properties like Looker, like warehouses, is the wrong way to go about it." - Matthew
[00:00:00] Lizzie: Hello, and welcome to "The Measure Pod" by Measurelab, the podcast dedicated to the ever-changing world of data and analytics with your hosts, Dara Fitzgerald and Matthew Hewson. Between them, they've spent more years than they'd like to admit wrestling with dashboards, data quality, and the occasional Google curveball.
[00:00:32] Lizzie: So join us as we share stories about how analytics really works today and where it might be headed tomorrow. Let's get into it.
[00:00:39] Dara: We are interrupting our own episode to bring you this news bulletin. So a new model has landed just to keep us on our toes. We recorded our episode, this episode that you're about to listen to, and just before it came out, Anthropic went and introduced a new model, which is [00:01:00] Fable.
[00:01:00] Dara: Which might be another name for a model that we've been hearing about, which is Mythos.
[00:01:06] Matthew: I think it is. I think they've said as much,
[00:01:09] Dara: Yeah. It's great, isn't it, to be able to just say, "We've built up a lot of hype around Mythos, so let's just call it Fable instead, and then we'll cover it if it's no good."
[00:01:15] Matthew: I think they've, I think they've released two models, but-
[00:01:18] Dara: Yeah ...
[00:01:19] Matthew: but one is, one is Mythos 5, which is for the elite. That's for the people who sip champagne and eat canapés in castles. But they get to, they get Mythos 5, and then Fable 5 is for the likes of us. It's the poor man's
[00:01:34] Dara: version.
[00:01:35] Dara: Yeah. It's the
[00:01:36] Matthew: beans
[00:01:37] Dara: on toast.
[00:01:37] Matthew: Yeah. It's the beans on toast. I think the main difference, from what I can tell, is it has this off-ramp that they've put into it. I think that's the main thing. So if at any point it thinks that you're going to try and get it to do anything in terms of cybersecurity- or bio- biology, it passes the request off to 4.8- Yeah ... and won't tackle it itself, [00:02:00] which happens quite a lot in my experience so far.
[00:02:03] Dara: Oh.
[00:02:04] Matthew: It keeps doing it. Because partly 'cause I'm doing bugs and things like that, but it'll go... It'll say, "Ah, I'm gonna give this to 4.8."
[00:02:10] Dara: Part, I was gonna say partly 'cause you're hacking into...
[00:02:12] Dara: I sh- probably shouldn't even make this joke on a live podcast. Yeah. No, you're doing nothing wrong. You're just, yes bug fixing. But interesting. So I didn't think it would be that sensitive. I thought it would be more around, I thought they'd set that threshold a bit, if it looks as though you're trying to do something very malicious, it would, it would do that.
[00:02:31] Dara: Yeah. But which... I just thought of something as well. It's quite odd that conceptually, it's oh, we recognize you're doing something that we don't approve of, so we're gonna give you our previously best model to do it because it's slightly, 'cause it's a bit less good than this new one. As opposed to if you're really doing something that it thinks you shouldn't be doing through Mythos, then surely you shouldn't be doing it full stop.
[00:02:55] Matthew: If you've managed to get it to trigger another model, could they not [00:03:00] just get it to refuse the
[00:03:01] Dara: request? You would think that would be maybe a bit more... I guess, again, it depends because if it's triggering for you, then obviously it is set, that threshold's set too low.
[00:03:08] Matthew: Yeah. Maybe it's, maybe that would just m- end up completely crippling it, and it would just...
[00:03:13] Matthew: if you said refuse request and people are like, "I just wanted to look- Yeah ... for bugs in my code, and it keeps saying no."
[00:03:18] Dara: Keeps calling me a hacker and trying to boot me out. Yeah. Yeah. Yeah. It's interesting though that you've been seeing that. I didn't consider that would be something that would pop up quite regularly, but, It
[00:03:26] Matthew: definitely does.
[00:03:27] Matthew: But for the things I've been doing anyway it... O- on for sort of greenfield projects where I've been, making code from scratch or something like that, it's not, it doesn't really come up, up as much. But when I'm going into established code bases and getting it to sweep for bugs or fix bugs or- look, try, like assess things from a security or scalability or whatever standpoint- it's been regularly popping up. Interesting. Which I guess makes sense, but yeah. And
[00:03:50] Dara: when it pops up you do get an actual pop-up and it tells you it's switching the model and then does the model- Yeah ... switch back again?
[00:03:57] Matthew: Yes. Yeah, it flicks back and forth. It [00:04:00] seems to just hand off tasks to it.
[00:04:01] Matthew: It uses, obviously, Claude uses a lot of agents anyway, but it seems to maybe s- it spins up a 4.8 agent in order to handle the request and then comes back to Fable again.
[00:04:11] Dara: I've used it. I just... I couldn't... When it offered it to me, I was like, "Yeah, sure." And I noticed a weird thing. I don't know if this was w- a little quirk or if this was to do with, sometimes where they...
[00:04:20] Dara: It's happened a few times now for different reasons, where your tokens suddenly go up to full again. But I switched it on and it said, it gives you the warning saying, "This uses double the tokens of Opus 4.8." And I'm thinking, "Oh, okay, here we go. I'm gonna be maxed out within five minutes."
[00:04:36] Dara: I had a tab open with my usage and I was keeping an eye on it, and it pretty quickly went up to... 'cause mine had just reset, and it pretty quickly went up to 16% or something of my session within a couple of prompts, and I was like, "Oh, this isn't gonna last long." And then I went and did something else with it, and then I went back again and it was back down to nine.
[00:04:57] Dara: I don't know if that was- Yeah, I noticed the same thing. Okay. [00:05:00] Yeah, I wonder what that was
[00:05:01] Matthew: about. Shortly after announcement, I think they did the, I think they did the reset again. Does it... I found it... so it seems to be, from what I've been seeing around, it's more token efficient, but is twice the cost of Opus 4.8.
[00:05:12] Matthew: So I've wa- been watching like a few videos of what people have been making with it, and the cost is the tokens, the output tokens that, that it takes, the input and output tokens that it takes to create what they've been doing have been lower than 4.8. But ultimately it still works out more expensive 'cause Fable is twice the price of 4.8, but not twice the cost.
[00:05:30] Dara: Interesting. So I wonder, have you seen anything that hints at what the real net number is then?
[00:05:35] Matthew: Fable, it looks to be really good at creating code bases- ... but also like 3D graphics and games. So a lot of people have been using it to make like browser-based games and things like that, so that's a lot of the examples I've seen.
[00:05:51] Matthew: Somebody got it to create a clone of Age of Empires, is the one I saw- Wow ... this morning. And they were putting it up against 4.8, so I think the clone of [00:06:00] Age of Empires was something like, it came out at $50 in terms of the API cost compared to, I think 4.8, 4.8 was like 35 dollars.
[00:06:09] Dara: So quite, quite a bit closer then, 'cause I had just had the headline figure of this is gonna cost you double.
[00:06:14] Dara: So that's what I had in my head.
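Editor's note: the arithmetic behind this exchange can be sketched in a few lines. This is a rough illustration, not official pricing. The $50 and $35 figures are the approximate API costs Matthew quotes for one demo, the 2x multiplier is the headline price difference mentioned above, and the function name `implied_token_ratio` and the idea of a single blended per-token price are assumptions made purely for the sketch.

```python
# Editor's sketch: figures are the rough ones quoted in the conversation,
# not official pricing. Assumes cost = tokens * price with one blended price.

def implied_token_ratio(cost_new: float, cost_old: float, price_multiplier: float) -> float:
    """Return the new model's token usage as a fraction of the old model's.

    If cost = tokens * price, then
    tokens_new / tokens_old = (cost_new / cost_old) / (price_new / price_old).
    """
    cost_ratio = cost_new / cost_old
    return cost_ratio / price_multiplier


fable_cost = 50.0        # approximate API cost quoted for the Fable build (USD)
opus_cost = 35.0         # approximate API cost quoted for the 4.8 build (USD)
price_multiplier = 2.0   # headline claim: twice the per-token price

cost_ratio = fable_cost / opus_cost
token_ratio = implied_token_ratio(fable_cost, opus_cost, price_multiplier)

print(f"Cost ratio: {cost_ratio:.2f}x")
print(f"Implied token usage: {token_ratio:.0%} of the older model's")
```

Under those assumptions, a build that costs about 1.43 times as much at twice the per-token price implies roughly 71% of the token usage, which fits the "more token efficient, but still more expensive overall" description.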
[00:06:15] Matthew: Yeah. No, definitely not. And I've noticed it- Yesterday I was-- I used it all day and I was, I hit my limit, but normally it was like 10 or 15 minutes to go. So it wasn't noticeably... I found it quicker. I didn't-- I don't know if they necessarily advertised it was quicker, but I found it quicker in a lot of respects in like just completing tasks much faster.
[00:06:39] Matthew: But then it al- it can also go off and run for hours. I think, I've... It's running on something for me now. In fact, it's just finished, but it had been going for 50 minutes just running away in the background. I've got four hours till my limit resets, and I've used 52%. Ooh.
[00:06:59] Dara: I sense, [00:07:00] I se- I sense an afternoon walk going on, or an afternoon nap.
[00:07:03] Matthew: Yeah. Yeah. That's because I've, I set four things going at once. I just kinda got greedy.
[00:07:09] Dara: I have I, I don't, I, I haven't, I don't know about you, I haven't... I wouldn't wanna say categorically what I think of it yet 'cause I haven't used it enough, but I feel like it's better in a way. Don't ask me to pin that down.
[00:07:24] Dara: But the thing, couple things I did with it, I asked it to review a bunch of different things that were going on and p- pull together a complete picture of it, and it just seemed a bit sensible about it compared... 'Cause a lot of the, those individual components of that, what I was pulling together f- I had done with Opus 4.8, and it seemed to grasp it even better than it...
[00:07:46] Dara: In other words, what came back as this kind of meta output was clearer and better than the individual conversations that I'd been having with Opus. It just seemed a little bit more kind of capable of understanding what I was getting at and pulling things [00:08:00] together and making sense of them and spotting issues.
[00:08:02] Dara: Whereas I had been banging my head against the wall a little bit with some of those things with Opus, which might partly be because there were some long sessions, which is something we've talked about before, where probably need to start creating fresh sessions a little bit more often and doing a handover.
[00:08:17] Dara: But it... Just gut feel tells me it's a little better. It's getting a better handle on things. But I haven't, yeah, I haven't done enough to say that, with any real rigor.
[00:08:26] Matthew: The benchmarks are much better, like in terms of what they've got on their site.
[00:08:32] Matthew: Always,
[00:08:32] Dara: yeah. But
[00:08:33] Matthew: again, I think us and a lot of other people are more questioning of these benchmarks than we ever have been. But so like agentic coding so the d- the software engineering bench pro that is 80% now compared to ChatGPT 5.5, which is 58%. So it's like a 30% leap there almost.
[00:08:53] Matthew: The agentic coding has jumped up 20%. The knowledge work has jumped up [00:09:00] significantly. The... There's a lot that's taken a big leap. I've seen a load of people doing like side-by-sides with 4.8 and- it's, it seems a much larger leap than we've seen in a good number of years. I've... When 4.8 came out, I pointed it at like some code bases just to do some scanning and checking for vulnerabilities, et cetera, and I was getting some useful things back and repairing them.
[00:09:25] Matthew: I did the same with this, even though it handed some of it off to 4.8 itself, but I got a much better response and some really interesting sort of vulnerabilities that I hadn't found previously off the back of it. So I don't know if it's better at... I think it's better at just generally, like you say, its responses are more thoughtful, more... Like, somebody was making a website with it and they compared it to 4.8, and they had OpenAI Image 2.0 or whatever it is, the latest one. And both of them also prompted it to get images for the products.
[00:09:58] Matthew: But even the [00:10:00] images it got back from a different model were much better than those that 4.8 produced, because it must have made a better prompt. It must have been able to construct that in a better way to get a better output. So it just... It's, it is a big l- it seems like a much larger leap than we've had in a while.
[00:10:15] Dara: Which, you'd be surprised if it wasn't, right? Certainly the amount they've bigged this up. And not just them bigging it up because, obviously there was a lot of marketing wrapped up in it, but I think you mentioned on the pod before that even people, YouTubers or whatever that you watch, where they're quite cynical about these things, they've actually said that this does look like it's, the real deal, at least in part.
[00:10:34] Dara: I got myself slightly distracted while you were talking there and put it through the foolproof test of the f- the pelican on a bicycle benchmark. I just sent you, I sent you one, and I'm gonna send you another one in a moment, and then you can tell me which one you think is Fable and which one you think is Opus 4.8.
[00:10:53] Dara: This is Simon Willison, I think is the guy's name. He's on a podcast that I listened to a while back, and he talked [00:11:00] about his... I think he... This is his benchmark, which is obviously a bit, it's a bit of fun as well. But you can see all the history on his blog about, like, all the different models and how interesting some of the pelicans look.
[00:11:11] Dara: But the one I just sent you does look like a pelican on a bicycle.
[00:11:14] Matthew: C- for some of the other ones that I looked at before of that particular benchmark, that is... I can see there's a sun in the sky, there's green grass, there's a red bicycle. There's some sort of bird. I can identify it as a bird.
[00:11:25] Dara: Yeah. We'd probably need a pelican expert to tell us how, how realistic it is as a pelican. But yes, it is a bird. So I'm gonna send you a second one now. We should maybe share these with the, for the show notes, just random pictures of pelicans. So this is the second, and I'm not gonna tell you what...
[00:11:43] Dara: it's a random order here. So- Oh ... which do you think... I think they're both pretty good.
[00:11:48] Matthew: I would guess that Fable is the first one.
[00:11:56] Dara: What makes you say that?
[00:11:59] Matthew: [00:12:00] Because the grass has got some shading on it and some growth and that's it.
[00:12:05] Dara: You're right. Yeah, I-
[00:12:08] Matthew: Yeah, the grass looks a bit better, I think.
[00:12:10] Dara: Yeah, the grass looks better, and I think the legs being the same length is, a hint of accuracy. I'm sure there's pelicans out there with legs of differing lengths, and I'm not looking to exclude them. But the second one the legs are matching where the pedal is at, so they're- they've become different lengths.
[00:12:28] Dara: But then the Fable one has got three... It looks like it's got three parts to its beak, which is not qui- that's not how I know pelicans. Again, maybe I'm not an expert. I'm not an expert. I feel a bit sorry for the second one as well. He looks like he's working a bit too hard on the bike.
[00:12:42] Matthew: We have to share these.
[00:12:43] Matthew: This is the worst podcast ever. With just two people describing terrible pictures of pelicans.
[00:12:50] Dara: But anyway, there are more, yeah, there are more kind of conclusive benchmarks out there. But I'm, I- I'm keen to put it through its paces a bit more and challenge it with a few different bits and pieces.
[00:12:58] Dara: And another... Sorry to, [00:13:00] I feel like I'm derailing this with silliness around pelicans and also the next thing I'm gonna say, but you notice there've been these kind of, there's always these gripes online as a new model's about to come out where people say the old one's getting worse. I have to admit, I've been getting more annoyed with Opus. And then this-- and I don't, I'm not one to say, I'm not gonna say that, correlation does not equal causation, but having said that, I've just been finding Opus getting a little bit sloppy in places.
[00:13:28] Matthew: Maybe there's something in it. I've got my theory that pe- we just get... there's a time, even if something was the same level of value, as we get used to it, our our patience wears thin. But it, but it-- I'm not ruling it out. Maybe it's a combination of the two.
[00:13:44] Dara: And even I wonder if this could be completely, this could be complete nonsense, but could there be a kind of technical reason?
[00:13:49] Dara: If the resource is being pushed in a direction, maybe there is some degradation of the... I don't know. I don't know. Again, I don't know the ins and outs of how all that works, but I wonder if their focus is [00:14:00] elsewhere and the bulk of the work is elsewhere, then maybe there's something to it.
[00:14:04] Dara: But who knows really? And yeah, it is hard to know. I think, I do believe what you've said, which is that, the more it offers, the more you take for granted, and the more frustrated you get with it not being able to do something, 'cause you're like: "What? Hang on. Why can't you just do everything perfectly?"
[00:14:17] Matthew: Yeah. And it's so funny, like it, the-- I'm getting annoyed. It's pr- it can produce basically entire websites now. Even 4.8 could do it just on, off a single prompt, and getting frustrated when it doesn't look quite the way you want it to. And I, and yesterday, when I was posting about Fable and stuff into the, into our internal Slack channel, I realized, reading another post, that Claude Code is one.
[00:14:38] Matthew: Yeah.
[00:14:38] Dara: I know. It's insane.
[00:14:40] Matthew: Yeah. We've been-- It's only been around a year.
[00:14:41] Dara: Yeah. So that's plenty of time. Why is it not perfect in every way? Yeah.
[00:14:46] Matthew: And the, and I remember the marvel of "Oh, look, it's just turning all my files" in doing it, and now it's like spitting out pelicans left, right, and center that are nearly pelicans.
[00:14:54] Dara: That are identifiable as a pelican, what more do you, what more do you really need?
[00:14:58] Matthew: And I suppose the other big thing to say [00:15:00] really is that you've got until the 22nd, I believe, of June to take advantage of this, because at that point, they're removing it from subscriptions and moving it over to a usage-based, API-based access.
[00:15:16] Matthew: So if you-- So you'd have to turn on extra usage in your Claude subscription to get it, and you'd have to pay the API. They claim they're doing that just because they don't know if they can handle the demand yet, but they want, as soon as they can, they want to add it back into subscription.
[00:15:32] Matthew: So it's not... there's all these theories that it's a g- give them a bit of crack, and then they'll come back for more, and they'll start paying all the API costs. Maybe that's the case, but they're claiming it is literally just so they can ease it into subscription models and not gobble up all compute at once.
[00:15:51] Dara: Which would be reasonable. But yeah, hopefully it does make its way back to the... 'Cause it is, 'cause if you- Let's not stick with that analogy. Let's say cakes. You have a nice tasty cake. Let's keep this,[00:16:00]
[00:16:03] Dara: let's keep this, let's keep this... it's not really, it's not really family-friendly probably anyway, but let's keep it, vaguely. You have a nice cake and then you think, "Great, I'll have another piece of cake." No, you can't have it. So that's, that's never fun. Yeah.
[00:16:16] Matthew: It's a funny, it's a funny paradigm though, isn't it? Because you've got a product that everyone ultimately is happy with, and is like chugging along and is raving about this product, how good this product is, and then for the same price, for no additional cost, you suddenly... the product suddenly gets much lo- much better.
[00:16:34] Matthew: There's not many, there's not many walks of life where like just sudden unprompted giant leaps in capability occur on a s- piece of software or a product, and then people would then instantly be jaded by your original perfectly working fine, everyone was happy with it service and want this thing.
[00:16:53] Dara: It's ba- it's peculiar, isn't it? Yeah. Yeah. I guess the same maybe happened a little bit with the internet, but it... this is almost that just [00:17:00] on another scale again. And it is... And the pricing thing, and okay we're in danger now of making this an episode rather than just a...
[00:17:07] Dara: I say in danger, it al- it already pretty much is an episode. But we... The, another interesting thing is at what point does this all start to just become even more commoditized and there's and if there's still th- three major players in the race, then the price is gonna start coming down. So then it's like it's gonna be, it's getting better.
[00:17:22] Dara: 'Cause Google did, Google have changed their pricing already this week. So yeah, it's if the pricing starts to come down, then it's like you're gonna be getting more for less in this sc- it's a weird... The whole thing's quite weird really, isn't
[00:17:35] Matthew: it? Y- yeah, and I think we mentioned this.
[00:17:36] Matthew: We might even mention it on this podcast that's about to come, I don't know, or the one after. But there's a, there's definitely a maybe getting to a point of people don't necessarily need Fable. If you're coding all the time and you're working with large code bases and, or creating new things, it's almost certainly gonna be better.
[00:17:56] Matthew: But for just day-to-day bits of stuff, 4.8 was great. Yeah, it really [00:18:00] was,
[00:18:00] Dara: yeah.
[00:18:01] Matthew: It was, like, unbelievable. So-
[00:18:02] Dara: Yeah, but, yeah, but-
[00:18:03] Matthew: Yeah ... yeah, but
[00:18:04] Dara: I want Fable.
[00:18:06] Matthew: That's it. Yeah, and that's always been the narrative so far, latest and greatest, but maybe the current is so good that starts, stops to be as...
[00:18:13] Matthew: Certainly for a big enterprise that are struggling with cost of APIs and things like that, if they're getting, if they're getting the bang for the buck they want from the current models, they might stick for a long while longer.
[00:18:22] Dara: Okay. Let's leave it there and get back into our regular episode, but this was important enough to drop it in here.
[00:18:28] Dara: But yeah, on with today's episode. Hello, and welcome back to "The Measure Pod." I'm Dara, joined by Matthew. I know who I am this week. That's a good start. Matthew, how are you doing today?
[00:18:39] Matthew: Good. Glad to hear we're back on, we're back on track-
[00:18:42] Dara: Yeah, it's these- ... with
[00:18:43] Matthew: name remembering.
[00:18:44] Dara: It's, I know.
[00:18:44] Dara: It's ne- it's, never mind the subject matter. That's all really straightforward and easy. It's just the basics like w- who are we, why are we here.
[00:18:51] Matthew: Yeah. No, I'm good. I'm good. Just been enjoying the weather and the never-ending onslaught of AI news. How about you?
[00:18:59] Dara: Yeah I [00:19:00] think we should... and here we have our episode for today.
[00:19:02] Dara: We're gonna talk about the British weather.
[00:19:04] Matthew: Yeah.
[00:19:06] Dara: I hear- Yes ... it was 49 degrees on average for about a week in the UK.
[00:19:12] Matthew: It was, yeah. It got up to a good
[00:19:14] Dara: 35- 36, I think ...
[00:19:16] Matthew: somewhere. Hot-hottest May day ever.
[00:19:17] Dara: Yeah, I think I heard 36 point something maybe was one of the... Wherever it is, it's always around is it Heathrow or somewhere?
[00:19:24] Dara: There's a, there's like a particular part on the outskirts of London that always gets the hottest, but I think it got up to 36 or something, which is crazy. Yeah.
[00:19:34] Matthew: Yeah. 'Twas warm.
[00:19:35] Dara: Yes. But
[00:19:36] Matthew: do you have air conditioning in your little Spanish pad?
[00:19:39] Dara: Yeah, but not in every room. So this room I'm in- Oh,
[00:19:42] Matthew: no.
[00:19:43] Dara: Yeah, no, honestly, you need it though. This room I'm in-
[00:19:46] Matthew: Yeah ...
[00:19:47] Dara: is gonna, I'm gonna be baking probably from about two or three weeks time onwards, so I'm gonna need to get one of those Dyson fans or something, those, like-
[00:19:57] Matthew: Yeah ...
[00:19:57] Dara: fancy... I'm gonna have to treat myself to a fancy fan [00:20:00] 'cause this room will be like a hot box.
[00:20:03] Matthew: Good.
[00:20:04] Dara: Yeah. That's what I get. That's what I, that's what I get from moving to Spain. Anyway, people aren't here to people aren't tuning in to listen to you, Ross. Forgot
[00:20:13] Matthew: what we were doing for a minute there. I just started to
[00:20:15] Dara: just- I was overconfident saying we've finally got the basics right. It appears we don't.
[00:20:20] Dara: No, we are serious people. We're gonna talk about serious stuff. So we I think I started the last episode with a confession, and it's gonna become a theme now. But we said in the last episode, oh, we were gonna do a deep dive on Next 26, and then we decided not to because we were a little bit underwhelmed.
[00:20:36] Dara: And guess what? We've decided to do a deep dive on Next 26. But it's Next 26 and Google I/O combined, and I guess we've just had a bit of time to sit on it and think about it and actually maybe we were a bit quick or I was. Good God.
[00:20:53] Matthew: I think I/O, I think they buried the lead in I/O, 'cause I/O was only this month and there was a lot more [00:21:00] meaty stuff in I/O, including a new model.
[00:21:03] Matthew: So combined with Next 26, I think it's worth going through at this point. But Next 26 was a bit thin on the ground. I/O's, I/O's where the money is.
[00:21:11] Dara: Yeah. Yeah, you're right. Yeah you're right. I was gonna be hard on us, but you're right. It was really I/O that made the difference. But I think the two together, there's a bit of a link between them and it's, anyway let's get into it. But we maybe one, one news item only, I think, which at the time of recording, it's gonna be old news by the time this comes out. But Opus 4.8 came out yesterday at the time of recording, so Thursday the 28th of May. So it's brand new and it's a, in their own words, it's a modest improvement.
[00:21:46] Matthew: Yeah, it seems yeah it they've... I still, I'd love to know exactly what it is, what the difference is between the point models. What is it they're doing that that, that changes it? 'Cause I've always had the theory that it's primarily post-processing, [00:22:00] post-training tweaks and stuff like that, that are getting more out of the model and the big number leaps, like the, of the solid numbers.
[00:22:07] Matthew: What do you call them? Whole numbers.
[00:22:10] Dara: It's not like we're data people or number people or anything.
[00:22:14] Matthew: No, so not talking about the liquid or gaseous numbers. We're talking about the solid numbers. When that moves up, I assume that's when they've retrained the whole new parameter set on X amount of compute.
[00:22:24] Dara: I guess so too, 'cause otherwise you... The, given the cost it takes to train the models, they're hardly running through that whole process and coming out with one of these little incremental increases. You can't imagine it.
[00:22:35] Matthew: No, but they c- I think it sounds like they've tweaked the behavior a little bit, so they- Th- they said they tried to make it more honest, so they've released some benchmarks around how they've lowered the misaligned behavior.
[00:22:50] Matthew: So they have a score of one to 10 where Opus 4.7 scored like a 2.5 of misaligned behavior out of 10. Opus [00:23:00] 4.8 scores like a 1.8 out of 10. Really gripping statistics. More in, more in line with Mythos, 'cause Mythos, who knows?
[00:23:11] Dara: Who knows?
[00:23:11] Matthew: They won't let us have it. Bastards.
[00:23:14] Dara: Mighty, Mi- Mighty Hoss is how I pronounce it.
[00:23:17] Matthew: Mighty Hoss, yeah. Mighty Hoss is like a one point whatever. Anyway, it's supposed to be, it's supposed to be more aligned, less trying to convince you of incorrect blah, blah, blah. It has gone up on the benchmarks, so it's improved on Opus 4.7 in agentic coding, in multidisciplinary reasoning, blah, blah, blah.
[00:23:39] Matthew: One interesting thing I did notice on that chart that they released on the announcement, they had Opus 4.8, Opus 4.7, ChatGPT 5.5, and Gemini 3.1 Pro. I was interested they didn't put 3.5 in, which obviously, spoilers, 3.5's out for Gemini.
[00:23:59] Dara: Yeah.
[00:23:59] Matthew: I [00:24:00] don't know if that's because it doesn't compete as well or they didn't have the time.
[00:24:03] Matthew: They'd done
[00:24:03] Dara: the work, they'd done the work before they heard about 3.5 and thought, "Ah, let's not rerun it."
[00:24:08] Matthew: Yeah. Yeah, you don't know if these are just reaction... It does seem a lot of the release times for these models seem very conspicuous, don't they? Google doesn't seem to join in that sort of scrambly race, but OpenAI and Claude do seem to suddenly have announcements after somebody else has.
[00:24:27] Matthew: Google and Gemini just sit back and release when they feel like it, it feels.
[00:24:31] Dara: But they're like the, they're like the older kid that doesn't need to get involved in the little squabbles. They're there and, if things really get, things get messy, they might get involved, but they can just kinda sit back and say, "Look you guys squabble for the scraps.
[00:24:43] Dara: We're, we're-"
[00:24:43] Matthew: Yeah, they don't need an IPO.
[00:24:45] Dara: No. No, exactly. They can, I know you said last, on the last episode, you said about reading in that Dem- Demis Hassabis book.
[00:24:53] Matthew: Yeah.
[00:24:53] Dara: That they actually were scrambling a little bit. But even still, I think you can when you're Google, you do have that luxury of being able to [00:25:00] sit back a little bit, don't you?
[00:25:01] Dara: And not immediately panic and do anything knee-jerk.
[00:25:06] Matthew: No, it seems as though they w- They're all striving for a similar thing at this point, which is they have the core intelligence transformer model type intelligence stuff, and they're trying to research and invent new technologies to plug in gaps to move that towards AGI, whether that's like long-term memory or learning on the fly, these kind of holes in the, holes in these models that mean it isn't, true AGI, can't learn new skills.
[00:25:33] Matthew: It's based on its past-
[00:25:35] Dara: Yeah ...
[00:25:35] Matthew: training stuff. So it seems like they're all just striving towards that, and they're just productizing along the way to make money. But it, but thinking about it, having just said it, you can imagine that OpenAI and Anthropic are both striving towards IPOs, and a damaging model release around about the time they're trying to do it would greatly affect their their valuation.
[00:25:58] Dara: Yeah. And speaking [00:26:00] of valuations, actually, did you see that Anthropic, their recent, their most recent valuation puts them above OpenAI?
[00:26:08] Matthew: Yeah, it's-- But I've seen a lot of that stuff floating around 'cause it is... Th- like I think I said last time, like their growth is-
[00:26:14] Dara: Exponential ...
[00:26:15] Matthew: yeah. It's absolutely i- insane.
[00:26:17] Matthew: And they... I've seen a lot of stuff floating around on the web of like it shows Anthropic's valuation and Samsung's valuation and then the revenue of both of those companies. And obviously Anthropic like 20 billion revenue, Samsung 280 billion revenue, and they both have the same valuation.
[00:26:37] Dara: It's nuts, isn't it?
[00:26:38] Matthew: Yeah. It's a bit of a narrow look at it. I suppose it's the potential of the technology, but-
[00:26:43] Dara: Yeah. Yeah. It's probably somewhere. It's like with a lot of things, isn't it? It's probably somewhere in the middle. Like these things get so inflated because of the hype, but at the same time, you can't just look at it based on revenue either because it's not, with a b- type of business like that, you've got to consider the future potential of it.
[00:26:58] Dara: So it's probably, [00:27:00] yeah, the true valuation probably sits somewhere in the middle. But, I think Samsung are gonna be the secret. I think they're gonna be the dark horse in all this. I think they're gonna come out with the, the best- As
[00:27:09] Matthew: AGI. Yeah.
[00:27:10] Dara: Straight
[00:27:10] Matthew: out the bat.
[00:27:11] Dara: They're just gonna, "Listen we just, we did it while you were all out there talking about it."
[00:27:14] Dara: Yeah. "We just, we solved it."
[00:27:16] Matthew: It'd be funny if McDonald's just released AGI or something. Just something out of left field. "Surprise, we've actually been running data centers in all of our restaurants around the world," and-
[00:27:27] Dara: "We were tr- really trying to crack the perfect Big Mac recipe, and we threw everything we had at it, and we ended up accidentally creating AGI."
[00:27:35] Matthew: I'm gonna put that, I'm gonna put a bet on that later. I wonder what the odds are.
[00:27:38] Dara: No, it's just gonna be some, just some person, some random person sat in a little hut in Siberia or something. They're just gonna, they're just a genius, and they're just gonna basically just decide, "Oh, I think I'm just gonna create something today," and they're just gonna make AGI.
[00:27:51] Dara: Yeah. It
[00:27:53] Matthew: does feel like there's pu- there's puzzle pieces there that people could take current te- It's OpenClaw a little bit, [00:28:00] where somebody just put a few pieces of things together and created something new that was outside of these frontier model labs. You can imagine more and more... And DeepSeek a little bit going on in China who's releasing these open source models that are cheaper but incredible in power.
[00:28:17] Matthew: It's these sort of... Doesn't necessarily have to be a frontier model lab that puts a couple of puzzle pieces together and gets there. They're most likely, obviously, but-
[00:28:25] Dara: Yeah. No, you're right. It doesn't have to be. It makes me think as well of that, the, the-- I know we're always quoting this book, but the "If Anyone Builds It, Everyone Dies" book.
[00:28:33] Dara: Cheery. Yeah. Cheery title. One for the kids. What, the scenario it plays out where it talks about how this, that wasn't about AGI as such, but it was about how AI could get out of control, and it was saying about how it would happen, it would chip away and it would gradually evolve itself.
[00:28:49] Dara: And y-y-you kinda have to think that if AGI is possible, it's c- it, then it's inevitable because the models themselves will just keep iterating. At some point, there's gonna be that [00:29:00] turning point, isn't there? Where po- it may-maybe it's Mythos, they'll run all the tests they can, they'll think it's safe and secure, they'll let it out of the box, and it'll be like, "Ha, fooled you." Yeah.
[00:29:12] Matthew: Maybe
[00:29:13] Dara: I'm out now.
[00:29:14] Matthew: We'll get back-- We'll go back to the Anthropic announcement in a minute 'cause I'm on- we're on one here. But I, on the s- on the subject of the book, I've just nearly finished another book, which is positive.
[00:29:25] Dara: Oh, I'm not interested in that one then. Can we move on?
[00:29:29] Matthew: It's the you'll probably recognize it.
[00:29:30] Matthew: It's called Ray Kurzweil, K- Kurzweil. But it, there there's a book called "The Singularity Is Nearer," which is a follow-up to "The Singularity Is Near." But he's much more about laying out how we get to the un- how we get to "Star Trek," essentially. So if you need a palate cleanser, it's worth a read to have some positivity of what the hell we're all doing striving for this stuff in the first place.
[00:29:50] Dara: Maybe. On one of my darker days, I might read it, but I do lean towards the doom and gloom.
[00:29:55] Matthew: Yeah. Yeah. The pretty much... I've, I read "The Coming Wave," everyone-- "If Anyone Builds It, Everyone [00:30:00] Dies." Th- all of them are very down on things. Anyway, so also Anthropic also basically with 4.8, which is the same price, which is interesting as well.
[00:30:10] Matthew: They've not bumped the price at all, but they also released like three-
[00:30:14] Dara: Yep ...
[00:30:15] Matthew: new functionalities along with it, with Claude Code. So dynamic workflows, which it just kicked off. I was just doing, we were doing some research for this show and I asked it to just summarize some stuff from I/O and thingy, and it kicked off dynamic workflows for me, which is, from what I can tell, from what I've read, it can run agents for a long time, and it can scale agents up to hundreds of agents.
[00:30:43] Matthew: And it's able to take in almost entire code bases and run things across these entire code bases, entire migrations, all this sort of stuff. So much larger workflows are able to be completed with it. But obviously with that comes a lot greater [00:31:00] cost from a token perspective. I've just, I built a little app yesterday that shows me my token percentage 'cause some stuff's running in the background.
[00:31:07] Matthew: It ran earlier and took about half of my token percentage on researching I/O and,
[00:31:13] Dara: Do you know what? I'm, I miss the recent days, it was about two weeks ago, three weeks ago, when I was running through the tokens quite regularly and complaining about it all the time. And then they bumped the limits, and I feel a bit like I'm not working hard enough now.
[00:31:27] Dara: I'm like, if I'm not hitting the limits two or three times a
[00:31:29] Matthew: day- Very
[00:31:30] Dara: competitive ... so I think I'm gonna set it off. I'm gonna ask it to create AGI and just see how many tokens that uses.
[00:31:36] Matthew: If everyone in the world right now just does that-
[00:31:38] Dara: Yeah, they'll blow up the- It's a
[00:31:39] Matthew: one, one GitHub repo.
[00:31:41] Dara: Yeah. But yeah, so there w- I didn't realize it was in, so did you do that in Cowork, that research? Is dynamic workflows. I thought it was only in Claude Code. I mean-
[00:31:50] Matthew: No, I did it in, I did it in Claude Code.
[00:31:52] Dara: Yeah.
[00:31:52] Matthew: Yeah. Yeah, 'cause I was messing around with... this is again a bit of an aside, but I was messing around with a different terminal because I wanted a nicer way to [00:32:00] have all my different code bases running.
[00:32:02] Matthew: So I just ran it in there and it's enabled by default, so it, it asks you, "I can do this dy- I can do this dynamic workflow, but it will be expensive if you want me to do it." So it's probably a bit of a waste to do that on-
[00:32:14] Dara: Ah, gotta try it
[00:32:16] Matthew: out ... on Singe, but yeah. So yeah, it's that. They've they've got something called effort control, which I thought this is in Cowork and Claude AI.
[00:32:26] Matthew: I thought they already had this, but maybe they only had it in Cowork. But you can essentially say how much effort you want the model to put in, which i- I think controls its token allocation.
[00:32:36] Dara: I also thought it existed because there was some hoo-ha about it, wasn't there? Because they changed it. So it ha- it, something-
[00:32:43] Matthew: Yeah
[00:32:44] Dara: either they-
[00:32:44] Matthew: They dropped it down.
[00:32:45] Dara: Yeah. So it did exist before,
[00:32:47] Matthew: but it does... I use it quite a lot in c- in Claude Code. If I, if it's stuck on something, I'll... If you type ultrathink it'll do a little rainbow animation and it it s- puts a lot more effort into it. And I think [00:33:00] the standard is X high, extra high, but then you can go to ultra.
[00:33:04] Matthew: I think with this, they've dropped it to high, but they say it's comparable to extra high for 4.7, and you can com- you can change it based on your needs, and you can... So you can, if you've got a really, you're really stuck on a problem, you can push it up to max, use more tokens, but get it to solve the problem.
[00:33:21] Matthew: I think 'cause they've realized getting it to think for longer with more resource t- has a, just ha- has that benefit. So that's there. And then a smaller thing is like you can now... It has annoyed me for a while, this actually, but you can now send messages to Claude when it's thinking, when it's working.
[00:33:43] Dara: Yeah.
[00:33:44] Matthew: And it'll take that in.
[00:33:46] Dara: That bugged me too. I didn't actually pick up on the fact that they'd change, changed that as part of this release. Because it even said, I think, it changes a lot, doesn't it? But I'm sure it said the w- the wording on the button said, "Queue [00:34:00] up message," and you'd send one and it would ignore it, and you'd have to send it again, and it's really annoying.
[00:34:06] Matthew: Yeah, so it says you can basically update Claude's instructions mid-task without breaking the prompt cache or routing or anything, yeah.
[00:34:13] Dara: Good. Yeah. 'Cause 'cause l-
[00:34:16] Matthew: I've just got into the habit of just stopping whenever it's... If I've had another thought- I've
[00:34:19] Dara: been doing the same ...
[00:34:20] Matthew: seeing it doing, I just stop it.
[00:34:21] Matthew: But if I could just say, "Wait a minute, consider this," that'd be better.
[00:34:25] Dara: I'm glad to know that 'cause even as of this morning, I've been doing exactly that. I hit stop. Which probably isn't great 'cause it's you, it's... Maybe when you're sending the follow-up prompt, it's probably more token heavy doing that, I would imagine.
[00:34:36] Dara: Maybe not, but.
[00:34:38] Matthew: I was reading a bit about... There, there was a video I was watching on caching yesterday, which was interesting as well. It was another aside, but apparently it's an hour standard, and if you let it lapse over an hour, it recaches absolutely everything. So sometimes if you've gone over an hour talking to it or you've left it sitting for a long time, it's better just to start again [00:35:00] if you think you can.
[00:35:01] Matthew: Otherwise, it'll go back through your entire conversation, recache it all, and gobble up a load of tokens immediately. So it's like good practice to start new sessions and think about how long you've been talking to it and not leave something sitting for too long before you come back to carry on.
[00:35:17] Dara: But did you see, this was before 4.8, but did you see they've introduced this, like it comes up with a little message saying, "This chat is X number of hours old.
[00:35:24] Dara: Do you want to start from it?" It's like a forced compaction, I think. Have you seen that?
[00:35:29] Matthew: I don't know if I have actually.
[00:35:30] Dara: It's only in the last, I don't know, the last week maybe, but before 4.8 came out. But it's... Yeah, if you've got, if you've got a session sitting there, this is in Cowork, may- maybe in chat as well, I'm not sure.
[00:35:42] Dara: But I obviously ignore it and just say, "No, I want y- give me the full, give me the full history." But I think it is ba- it basically will just force it to compact.
[00:35:51] Matthew: Yeah and I've got bad, I've got bad habits. Yes, th- so compacting regularly is good, but I've got bad habits of just picking up old conversations, [00:36:00] and I'll just let it run and run.
[00:36:02] Matthew: But in reality, I'm probably absolutely destroying my my tokens by doing that because it's recompu- it's recaching and-
[00:36:09] Dara: To- yeah, tokens and also the quality. I finally moved-
[00:36:13] Matthew: Yeah ...
[00:36:13] Dara: on from one yesterday. I had a session running for probably weeks and weeks, and h- honestly, it was getting so bad. It was like, I was...
[00:36:20] Dara: And I was- Yep ... I don't know why I keep doing this, but I was like, I kept getting more and more annoyed with it saying, "Why are you c- why are you making so many mistakes?" And then I was like- Oh, you
[00:36:30] Matthew: remember every fact I've ever told you ...
[00:36:32] Dara: "What's the matter with you?" So eventually I was like, "Oh, okay." The penny dropped, and I was like, "I think I know the problem here."
[00:36:37] Dara: It was cr- bursting at the seams, I think, trying to keep all the context. So I just started a new session, and it was like, it was flying. So yeah, there's a bit of kind of hygiene to be done from time to time, isn't there?
[00:36:48] Matthew: Yeah, I've seen th- there's a, there's some skills out there that, that are like handover skills.
[00:36:53] Matthew: So you can run that, it'll grab everything, put together a handover doc, and then clear the session and pass that handover doc to the [00:37:00] next session. So if you feel like you're at a rough break point, you can do that and kick off again. So I need to get into the habit.
[00:37:06] Dara: Yeah. Same. Yeah. Same.
[00:37:08] Matthew: Or maybe t- maybe in a week they'll come out and say, "Oh, we've got a trillion context window."
[00:37:11] Matthew: And
[00:37:13] Dara: you're like, "All right, fine." And I just for- forget about... Yeah, exactly. That's where it's headed, isn't it? All right. So I think we've probably meandered enough. How much news was in that news section?
[00:37:21] Matthew: Like one thing and a long conversation about air conditioning and-
[00:37:27] Dara: A- and AI, the usual doom and gloom, A- AI taking over and, destroying us all.
[00:37:32] Dara: All right, so onto... so yeah, so we decided we will do a bit of a deep dive on Next '26 and Google I/O, which happened within... Funny that they do this, isn't it? Like they happened within five weeks or something.
[00:37:44] Matthew: Yeah. April was Google Cloud Next, and then I/O was in May, I think 19th of May, so not long ago.
[00:37:50] Dara: No, 10, 10 days ago at the time we're recording this. So yeah, they do them pretty much back to back, and I think that, as you said earlier, I/O was really the more, probably the more interesting [00:38:00] of the two.
[00:38:00] Matthew: Yeah. We can start with, we can start with Google Next, 'cause we briefly mentioned them, but not gone into any massive detail, but we maybe will skim over them again, I don't know.
[00:38:11] Matthew: So there was this rebrand and expansion of Vertex AI, so they've moved it away from being called Vertex AI now, which it has... it's been called Vertex for a long time. Like when it was before LLMs and before the transformer sort of revolution, it was called Vertex. So I guess this kinda makes sense.
[00:38:31] Matthew: But so they've now called it Gemini Enterprise Agent Platform. Much catchier.
[00:38:38] Dara: Yeah. I hon- honestly I don't know. Do you think they use Gemini to come up with the names? Or they have- Maybe ... they've got probably a team of people that their job is to come up with these names that just do not roll off the tongue at all.
[00:38:51] Matthew: No. No. And pretty much everything... I'd be surprised if, six to twelve months from now, there's anything that isn't called Gemini. We've got the running theory that [00:39:00] Google's gonna rename themselves Gemini at some point, maybe not Alphabet, but Gemini. Google becomes Gemini.
[00:39:06] Dara: Google becomes Gemini, yeah.
[00:39:07] Dara: Yeah. But yeah, so they changed, and then they mer- they merged Agentspace in with it, right?
[00:39:12] Matthew: Agentspace. Is it Agentspace or is it Gemini Enterprise?
[00:39:15] Dara: Agentspace became something else, didn't it? And now it's all merged in. It's so confusing. Yeah. Agentspace was what they called it at Next '25, I think, wasn't it?
[00:39:25] Dara: And then they renamed that to Gemini Enter- or it became part of Gemini Enterprise. Yeah. So that's all rolled up under one.
[00:39:34] Matthew: Yeah. Yeah. They said they said it, it's that, and they the... in expanding it, they're talking about, they're moving away from like a call and response type thing for a lot of the AI functionality that's in there to having like agents being the core model and deployable aspect of everything.
[00:39:52] Dara: And long-running agents, I think, as well. I think a lot of the updates seem to be around putting in place what they need for these long, [00:40:00] long-running agents.
[00:40:01] Matthew: Yeah, and a lot of stuff at an enterprise level about governance, monitoring, observability. You don't need any of that 'cause SIEM exists.
[00:40:10] Matthew: You can just use SIEM to do all of that, but-
[00:40:12] Dara: Yeah. Yeah ...
[00:40:13] Matthew: Google's having a go as well,
[00:40:14] Dara: yeah. Oh good on them. They're, you know- Yeah ... trying. Yeah.
[00:40:17] Matthew: Yeah. So yeah a typical sort of enterprise play at really productizing productionizing all of these sort of experiments and agents and all the rest of it that have been, for the past few years they've been experiments and tinkering, and they're now trying to very much put it into place.
[00:40:34] Matthew: This is how you deploy and actually use and get benefit out of these technologies. So that was one thing. And then the TPUs, eighth generation TPUs. I never know what to say about these other than-
[00:40:51] Dara: No ...
[00:40:52] Matthew: they are, they're better.
[00:40:54] Dara: They're better. They have-
[00:40:55] Matthew: Significantly better ...
[00:40:55] Dara: they have all these weird they're probably not metrics, they've got all these weird, I'll have to look [00:41:00] them up now, but they've, they, like, all these things that they talk about, the qualities of them and they're all these... They sound like made-up words. So every time... 'Cause I remember us talking about it last year as well, and it was like, "Oh, the seventh generation has got so many flux capacitors versus the previous one."
[00:41:14] Dara: It's like all these things. It sounds like- Yeah ... are they taking the piss? Are they r- are they really just making these things up and see if anybody knows?
[00:41:20] Matthew: Yeah. Flop- floppy slices and-
[00:41:23] Dara: Floppy, that's it. Something to do with flop- flops.
[00:41:25] Matthew: Teraflops.
[00:41:26] Dara: Teraflops yeah. Yeah. They just sound comical but but yeah- Yeah
[00:41:29] Dara: I'm the same. They're more powerful. It's better. It's an upgrade.
[00:41:33] Matthew: One thing I did hear, I don't know if this is true, chuck a bit of salty news in there. I'm pretty sure they... It's either something, it's connected to the TPUs or some other piece of sort of innovation they've created, but it allows them to link up their data centers, their global data centers, to train models.
[00:41:55] Matthew: I think the idea being that rather than having to have one giant data center on [00:42:00] which to to, to train all the compute, they can link them all 'cause they've got such high throughput, crazy networking across the world. They can hook them together and create these huge, even bigger data centers than anyone's ever seen.
[00:42:13] Matthew: I have no source or knowledge of where I heard that, but I think I did.
[00:42:18] Dara: I was just gonna say that's possibly the saltiest bit of news we've we've ever had.
[00:42:23] Matthew: I like people to do their own homework. Just go and fact- see if you can Google that.
[00:42:27] Dara: Yeah. We throw stuff out there. It may be true, it might be completely fabricated.
[00:42:30] Dara: It's up to you to figure it out.
[00:42:32] Matthew: Yeah. That's... We- what we're doing is training people's critical thinking ultimately. That's the aim of the podcast. Are we talking nonsense or are we telling the truth?
[00:42:40] Dara: No I've heard it too. I think it's called Skynet. I think that's what they call it.
[00:42:44] Matthew: It does- yeah, that sounds like a leap towards that, doesn't it? But yeah, no, I'm sure I heard that somewhere, but who knows? They also released this agentic data cloud, so that's Yeah ... allowing it to, allowing [00:43:00] them to call some enterprise SaaS apps- Yep ... directly from Google or define what it, define what is in there and opening things out.
[00:43:11] Matthew: It's it's like, apologies for people who didn't listen to last podcast and don't know what SIEM is after we talked about it last week. But it's, in a way, it's like defining the model and going to the source and pulling the information back rather than having to pipeline and move it. So it's it's an adv- it's an e- evolution of the open cloud stuff where you could have BigQuery and you could call from Azure and call information back from Azure into BigQuery and process it there.
[00:43:38] Matthew: It seems like an extension of that, but for being able to add s- semantics, et cetera as part of it.
[00:43:45] Dara: Is that cross-cloud lakehouse, is that new or-
[00:43:50] Matthew: The cross-cloud lakehouse I don't think necessarily is, or at least the concept isn't, because we've done it before with clients where we've...
[00:43:59] Matthew: They've got [00:44:00] some data sitting over in Azure or Dynamics or something like that, and we don't necessarily wanna move it again. But because we've got, say, GA4 data or some other data sitting in BigQuery where we wanna do some analysis, we'll just set up a connection to Azure and just be able to query the data in place.
[00:44:21] Matthew: So you use BigQuery's qui- query budget, but you leave the data stored where it is. So there's been this open cloud concept for a while. I just don't know if it's the layering in of the semantics and company knowledge to that, which is what's making it open. Gemini what's it called? Agen-agentic data cloud.
[00:44:38] Matthew: Agentic
[00:44:40] Dara: data cloud. Yeah, it's honestly, don't... I think if somebody passed a test on what are all of these things called, they would be a savant.
[00:44:48] Matthew: Yeah. Letting agents answer questions and take actions grounded in company's real data rather than just their training knowledge, analytics, reporting, and decision support agents while keeping data governance intact, [00:45:00] avoiding costly migra- data migration between clouds.
[00:45:02] Dara: Ah.
[00:45:03] Matthew: Yeah.
[00:45:03] Dara: Obviously.
[00:45:04] Matthew: There you go.
[00:45:05] Dara: Look, this is prob- at the risk of derailing us a little bit, like the... I think this was a bit of subtext and we are biased, but if the headline of the two events is like agents are the future, I think the subplot is definitely the governance that needs to sit, the grounding that needs to sit underneath them to make them useful.
[00:45:24] Dara: Which is good news for us given where SIEM is heading because SIEM effectively is that layer that can provide that context. But reading through it, there was definitely a lot of, It, it wasn't the headline, but within most of the updates, there was talk of something around semantics or governance or grounding, because if they are pushing agents, those agents are gonna be absolutely useless if they don't have some kind of proper context.
[00:45:51] Dara: The interesting thing I thought though is, and this makes sense for Google, given they're a huge technology company, is they're trying to AI-ify it all, [00:46:00] and it... and what there isn't is that kind of human focus where a lot of those governance decisions need to be made by humans. So Google are putting a lot of stuff in place that will let you, read documents and pull together AI-generated semantics.
[00:46:12] Dara: But there's still that gap where humans need to figure out what should the rules be in the first place.
[00:46:19] Matthew: Yeah. And it's all very much in, in the warehouse in Google's... in Google Cloud platform, you have to define all this stuff. You still have to be beholden to that. It's not the most accessible thing.
[00:46:28] Dara: No, that's... No, and you're locked in as well. It's not maybe as portable. But anyway, so-
[00:46:33] Matthew: It's funny, isn't it? 'Cause it's like open in that you can retrieve information from other places, but you're very much, you have to do it and build it and create it all within Google Cloud Platform.
[00:46:43] Dara: It's a weird, yeah, it's a weird thing, isn't it?
[00:46:45] Dara: It's like how, yeah, like they, they could... it's not viable for them to ever make anything completely open. It's oh, you're open to use other... It is like open in their favor. It's we'll let you query other [00:47:00] sources and, and pull in data from other places, but you do the work in our backyard.
[00:47:05] Matthew: Yeah. In our yard.
[00:47:07] Dara: In our yard. Which is, fair enough. They're a business at the end of the day.
[00:47:11] Matthew: Yeah. Yeah. Exactly. Yeah. But yeah, it's interesting. It's interesting just on a macro level where it's all going. There's, there is a lot of, now a lot of... I think since people have seen SIEM, there's a lot of talk around governance and that kind of stuff. I, I jest of course, but yes, definitely where it seems to be going at the minute.
[00:47:31] Matthew: People, people's next... There, there's probably like a list of gripes people have with AI, and they've ticked a couple off: "Oh, it's not very good." It is now. "Oh, it's not, it's hard to deploy." It is now. It's easy now. And now it's like governance and the next steps, the problems that need to be solved.
[00:47:47] Dara: Yeah.
[00:47:47] Matthew: And that's where they're heading.
[00:47:48] Dara: So onto Google I/O, which only happened just over a week ago.
[00:47:53] Matthew: Yeah. The big one was 3.5.
[00:47:56] Dara: Yeah.
[00:47:57] Matthew: That's their next generation of models, [00:48:00] but they're not... yeah, generation. It's, I, again, I'm confused by the wording still.
[00:48:05] Dara: You can't really say it's a generation, can you?
[00:48:07] Dara: But if it's a, it's like an iteration. But the... I can't remember if you said this at the top or not, I think you might have done, or if it's in our notes, but it's not the smartest model, and almost by design. It's, they're going for it's faster and cheaper, and it's- it's
[00:48:22] Matthew: Flash is what they released first, wasn't it?
[00:48:23] Matthew: Yeah.
[00:48:24] Dara: Yeah. Yeah.
[00:48:25] Matthew: But I did notice, I was in Gemini yesterday, and there is now a 3.5 thinking in there. Ah,
[00:48:32] Dara: so maybe that's comparable with the, the likes of Opus or ChatGPT-5.
[00:48:38] Matthew: I don't know, 'cause it's, they are, it is confusing 'cause there's also, it, they had 3.1 Pro.
[00:48:45] Dara: Yeah.
[00:48:46] Matthew: I think, I think- Claude got the name, has got the naming conventions right.
[00:48:53] Matthew: It's qu- it's quite clear. You've got three levels soon to be four by the sounds of it, but you've got three levels, and you've [00:49:00] got the model number. Opus is the big, heavy-
[00:49:03] Dara: The premium ...
[00:49:04] Matthew: whatever. Yeah. It just feels easier just to go Haiku for... I never use Haiku, but I probably should in times.
[00:49:11] Matthew: But yeah, Gemini 3.5 Flash and then 3.5 Thinking. Is Thinking the pro or is it not the pro, or is that just the standard model, or is that Flash with Thinking?
[00:49:21] Dara: Yeah.
[00:49:22] Matthew: I've not seen anything about all the models outside of the Flash coming out. N-
[00:49:28] Dara: no, and that's why I, and maybe I was thinking too narrowly, but it was fitting the narrative a bit, thinking that the that their focus on speed and cost is supporting this kind of agent-focused future they're betting on.
[00:49:43] Dara: Yeah. That it's gonna be... N- not necessarily saying that it's less about the model, but that the model is driving the kind of agent behavior.
[00:49:50] Matthew: Yeah. That's their positioning was like, it's... Flash 3.5's an engine for agents rather than an engine for chatbots, because an [00:50:00] agent's doing lots of tasks, and multiple model calls and loops.
[00:50:05] Matthew: The quicker it can work through those tasks and reduce the latency, the more that it's gonna feel the better it'll feel, the more financially viable it is, et cetera. If you've got big, cumbersome models working through those things, it's more expensive, slower and less good. I think that's their- Yeah
[00:50:24] Matthew: that, that's their idea there. So yeah, high volume, low latency, but with some intelligence. I'm assuming that 3.5, other 3.5 models will be incoming if they aren't there already.
[00:50:35] Dara: Yeah. Yeah. Yep. I think so too.
[00:50:37] Matthew: That's... So I was gonna say another, essentially another agent. Agents. It's all agents. It's all that Google are talking about at the minute is agents, agentic, agents, agentic.
[00:50:46] Matthew: They really didn't talk much about the classical chat sort of interface. They very much are in that world.
[00:50:53] Dara: 'Cause it's all enterprise, isn't it? And that's the, I guess that's the bet they're making is that they're not, maybe not going [00:51:00] after the end, the, the Joe Bloggs end user that maybe OpenAI and Anthropic, you could argue they're starting to focus a lot more on enterprise as well.
[00:51:11] Dara: But I think with Google, they're really just, that their big bet is enterprise. They're going up against Microsoft and-
[00:51:17] Matthew: Yeah, no, actually, you're right. I was gonna say, I think they're much more, Anthropic are much more in the
[00:51:24] Matthew: devel- developer world, but Cowork is a player. It's a player. It depends what we mean by enterprise. Google are much more about, it seems like a lot of what they're doing is about provisioning and building and they're flexing their cloud muscles to help people provision and own and do this on an enterprise level.
[00:51:39] Matthew: Anthropic is much more been around like services that they provide and models they provide and although you can host them on Google Cloud and stuff, but- Yep. Yeah. Yeah. And the next one is interesting actually. The rest was boring. This one-
[00:51:54] Dara: Yeah. Yeah. Let's get to the get to the interesting ones.
[00:51:57] Matthew: Gemini Omni.
[00:51:59] Dara: Yeah. [00:52:00]
[00:52:01] Matthew: Yeah. Yeah.
[00:52:02] Dara: Any to any. Multimodal. Anything in, anything out.
[00:52:07] Matthew: Which I thought- was already the case, but I don't know if it's just a much a, they've cracked it. I don't know. But th- there's some demos I've seen floating around. I tried to use it the other day, but I guess, you can guess what happened.
[00:52:21] Matthew: It wasn't available in the UK.
[00:52:24] Dara: Same. I thought I had used it and then realized afterwards, 'cause y- you can click through to it, but it's just Veo. It's not... Yeah. And I thought I had used it but it's-- what part of that little exercise was seeing that Veo is better than it was the last time I used it, because I did think it was Omni.
[00:52:40] Dara: I was like, "Oh, this is much better." And then I was like, "It isn't even Omni."
[00:52:43] Matthew: No. Yeah, I think they have bumped the mo- the image models and the video models as well. But yeah, Omni the use cases I-- use cases, examples I saw was, like, people... Or I have seen on YouTube as well, people walking around taking a video, [00:53:00] and then they pass it into Omni and just get variations of somebody's just talking and saying, "Look, I can put any object in my hands," and then they get Veo to drop a dinosaur bone, and sorry, Omni to drop a dinosaur bone in their hands and just manipulate everything they want in video form.
[00:53:17] Matthew: Which is cool. It's really, the, it's really interesting to see, like, how this would be, actually be used in, like-
[00:53:25] Dara: Creatives ...
[00:53:26] Matthew: yeah, like CGI. Some of the visuals I've seen, and I don't know if this was Omni or if it was a combination of Nano Banana and some video models, were people changing their arm into water and moving it around, and it looked pretty convincing.
[00:53:41] Matthew: You can imagine it being a really powerful tool for creatives and CGI folks.
[00:53:47] Dara: Definitely,
[00:53:47] Matthew: if they adopt it.
[00:53:48] Dara: If they adopt it, yeah. And without getting too philosophical about it, I think if people embrace it... It's like with everything: at first people are like, "Oh no, it's gonna take away what we bring [00:54:00] to it."
[00:54:00] Dara: But I think if you're open to it and you can apply your creativity to it, it's like with everything, it gets rid of some of the grunt work. A lot of video editing and image editing is not glamorous, and to be able to get rid of some of that...
[00:54:12] Dara: Like
[00:54:12] Matthew: storyboarding, I can imagine. You'd be able to literally film the actor and just put in a couple of ideas of what this might look like, and yeah, it seems like-
[00:54:23] Dara: Yeah, the storyboard would almost be the end product. Yeah.
[00:54:26] Dara: Yeah.
[00:54:26] Matthew: Yeah, maybe it's too much power.
[00:54:28] Dara: And to be able to explore so many different options that would be costly otherwise. Imagine if you want to go, "Do we set this on the moon or do we set it in the Sahara Desert?" You could have so many different variations of it.
[00:54:40] Matthew: It's a common problem.
[00:54:41] Dara: It is, yeah. Yeah. But anyway, I guess-
[00:54:43] Matthew: Yeah. "Lawrence of Arabia" had that same problem, I suppose-
[00:54:45] Dara: If only they had advanced AI models back then.
[00:54:48] Matthew: Yeah. Yeah.
[00:54:49] Dara: But yeah, I guess that's the kind of practical use case for Omni in the real world, and it's insane how good it's getting.
[00:54:58] Matthew: I think one [00:55:00] big play that I can see coming here for Google, and I think one of the things that Omni is aiming at, is real-time assistance. So they've announced their glasses and all of this other stuff, and this complete multimodality where it can see a picture, understand text, see what's happening around it and interpret it, listen to speech and understand that, and create things off the back of it.
[00:55:23] Matthew: Couple that with, say, 3.5 or some lighter model that's quick and able to process this stuff, and you're starting to get towards real-time assistance that can do live translation and visual help and maybe even begin to augment your reality a little bit, add new things into where you're looking and what you're seeing.
[00:55:44] Matthew: I think that's where they're heading with it. That's their play, which is cool.
[00:55:48] Dara: Yeah, one of their many plays. It's interesting, isn't it, thinking about what their end goal is with all of this. But anyway, that's for another day.
[00:55:57] Matthew: Yeah. Next is [00:56:00] Antigravity 2.0. So that's their new version.
[00:56:06] Dara: Yes, it does new things, I believe.
[00:56:09] Matthew: Yeah. I did spin it up and use it, but I didn't finish what I was doing, so I haven't got much concrete to say about it.
[00:56:18] Dara: No. I think my take, and tell me if you've got a different take, is that it is again part of this same theme of being very agent-driven, so they've added a lot more capability in terms of multi-agent support.
[00:56:33] Dara: And it depends what you're working on, doesn't it? I kinda think if you're working in VS Code now, you'd only really move to Antigravity if... I think one of the big reasons would be if you are doing multi-agent stuff. Otherwise it's much of a muchness.
[00:56:50] Dara: I guess what it can do is it can support multiple models as well, so maybe that would be an interesting use for it, 'cause I don't know, can you do that in Claude Code? Can you [00:57:00] run models outside of Anthropic in Claude Code?
[00:57:03] Matthew: No, I don't think so. I don't wanna categorically say that, but I don't think so.
[00:57:08] Matthew: I might be wrong.
[00:57:09] Dara: I'm not aware that you can, but again, yeah, I might be wrong too. But yeah, otherwise Antigravity, yeah, I don't know.
[00:57:14] Matthew: I think it just sounds like it's moved towards what others are doing. So delegating whole tasks to agents rather than necessarily getting suggestions back, and aiming towards more of that, like you say, autonomous and multi-agent stuff.
[00:57:32] Matthew: Interestingly, this flies in the face of our "everything's Gemini, rename Google Gemini" theory, because they're renaming Gemini CLI to Antigravity CLI.
[00:57:45] Dara: Yeah, that's a bizarre one.
[00:57:47] Matthew: Yeah, it's weird. I don't quite get that.
[00:57:50] Dara: No. Nor me.
[00:57:52] Matthew: Unless it's some, "We've got a good audience for Gemini CLI, let's change it to Antigravity and maybe people will realize Antigravity exists," [00:58:00] and then...
[00:58:00] Dara: Maybe they're throwing it a bone.
[00:58:02] Matthew: Yeah. 'Cause I don't know how many people are using it. I can't imagine it's tons.
[00:58:07] Dara: No. No. I can't imagine that.
[00:58:10] Matthew: But I'm mainly just in the terminal now. I'm not even in VS Code recently. I've downloaded a fancier terminal, and I'm using tabs and notifications and things like that in there rather than...
[00:58:21] Matthew: So yeah. Who knows?
[00:58:23] Dara: Yeah. Yeah. We'll see on that one.
[00:58:26] Matthew: Yeah. A few other bits in there, but I think the most interesting next one is probably WebMCP.
[00:58:32] Matthew: It proposes an open standard extension to MCP. And the idea is that it's like the Model Context Protocol idea, but for websites.
[00:58:44] Dara: Yeah. It does what it says on the tin.
[00:58:47] Matthew: Yeah. So it would let websites declare structured capabilities or tools for the agent to call directly. So like search products or add to cart or book appointments. So you, as a web [00:59:00] developer, could expose these tools through WebMCP and just make it easier for agents to service people in that way.
[00:59:07] Dara: Makes complete-
[00:59:08] Matthew: ...much easier.
[00:59:09] Dara: Yeah, makes complete sense. Wasn't there something like that? Maybe it wasn't from Google, but was there not some talk... I can't remember what it was called now. There was some other proposed WebMCP-type thing. Maybe it was an original version of this or something that they've now bolstered.
[00:59:25] Dara: But I'm sure there was something similar before around this. But it makes complete sense. It's the way things are going.
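The idea Matthew describes, a site declaring tools such as "search products" or "add to cart" for an agent to call directly, can be sketched roughly as below. This is a minimal illustration only, not the WebMCP API: `ToolRegistry`, `declare`, `manifest`, `call`, the catalogue and the handler functions are all hypothetical names invented for the example.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical illustration only: none of these names come from WebMCP.


@dataclass
class Tool:
    name: str
    description: str
    parameters: Dict[str, str]  # parameter name -> type label
    handler: Callable[..., Any]


@dataclass
class ToolRegistry:
    tools: Dict[str, Tool] = field(default_factory=dict)

    def declare(self, name, description, parameters, handler):
        """Register a capability the site is willing to expose to agents."""
        self.tools[name] = Tool(name, description, parameters, handler)

    def manifest(self) -> str:
        """JSON description an agent could read instead of scraping the page."""
        return json.dumps(
            [
                {"name": t.name, "description": t.description, "parameters": t.parameters}
                for t in self.tools.values()
            ],
            indent=2,
        )

    def call(self, name: str, **kwargs):
        """Invoke a declared tool, rejecting unknown tools and missing arguments."""
        tool = self.tools.get(name)
        if tool is None:
            raise KeyError(f"unknown tool: {name}")
        missing = set(tool.parameters) - set(kwargs)
        if missing:
            raise ValueError(f"missing arguments: {sorted(missing)}")
        return tool.handler(**kwargs)


# Made-up shop data standing in for whatever sits behind the website.
CATALOGUE = [
    {"sku": "A1", "title": "Blue kettle", "price": 25.0},
    {"sku": "B2", "title": "Red kettle", "price": 27.5},
    {"sku": "C3", "title": "Toaster", "price": 40.0},
]
CART: List[str] = []


def search_products(query: str) -> List[dict]:
    return [p for p in CATALOGUE if query.lower() in p["title"].lower()]


def add_to_cart(sku: str) -> int:
    if sku not in {p["sku"] for p in CATALOGUE}:
        raise ValueError(f"unknown sku: {sku}")
    CART.append(sku)
    return len(CART)  # number of items now in the cart


site = ToolRegistry()
site.declare("search_products", "Search the catalogue by title", {"query": "string"}, search_products)
site.declare("add_to_cart", "Add a product to the cart by SKU", {"sku": "string"}, add_to_cart)
```

An agent would read `site.manifest()` to learn what is on offer and then use `site.call("search_products", query="kettle")`, rather than rendering the page and guessing at its structure.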
[00:59:32] Matthew: Yeah. It feels like, like you say, one of these has to be adopted, because you can fight it all you want, but right now more and more people are gonna be using the Claude extension in Chrome, or they're gonna be using Playwright, or doing web scraping and just trying to get to the information some other way.
[00:59:54] Matthew: And most people now are in some sort of LLM. They're using it on a daily basis, and they're gonna be getting [01:00:00] frustrated by the brittleness or the flakiness of some of the techniques for retrieving this information from websites.
[01:00:05] Dara: That's interesting. Do you th- do you think... I hadn't really thought about this.
[01:00:08] Dara: Do you think it would be practical, and do you think people would expose the content of the site itself? I was thinking it would be like a database, if there's information behind-
[01:00:18] Matthew: Yeah, that's why I'm saying that with scraping, 'cause like-
[01:00:20] Dara: Yeah, 'cause scraping's a pain.
[01:00:22] Matthew: Yeah. Trying to get stuff out of... Like, one example recently: I was trying to get some information out of a tool that didn't have an export. It was showing a table on the page, but the table wasn't actually in the HTML. It was being generated, and I just couldn't retrieve it.
[01:00:40] Matthew: So having some tool in there that just says, "Here's..." Maybe people don't wanna do that. Maybe the, maybe there's proprietary information that they don't want you to scrape and grab, but-
[01:00:49] Dara: But I wanna be able to do that, but I wanna do it.
[01:00:50] Matthew: Yeah, I wanna do it.
[01:00:51] Dara: Yeah. Yeah. No, that'd be...
[01:00:53] Dara: I don't know... Could this lead to a way where the whole site is structured in a way [01:01:00] that... Because that's really common. I'm doing that all the time, trying to pull information just for research. Usually it's "go and understand this website and feed it into this research project" or whatever, and it's always coming up with issues.
[01:01:13] Dara: It's funny, isn't it? It's one of the things that feels really clunky at the moment. If you're connecting to something through an MCP, it's all nice and fluid and smooth, and then it goes off to scrape a website and it's, "Oh, this didn't render," and, "I'm hitting..." There's issues with, what do you call it?
[01:01:29] Dara: The domain policies or whatever. And it's coming back with all these problems. So if websites were structured in a way that was, I don't know, more MCP-friendly or whatever.
[01:01:42] Matthew: Even on content, like content.
[01:01:44] Dara: Yeah. That's what I'm thinking even.
[01:01:45] Dara: Yeah.
[01:01:46] Matthew: I think we'd do it, right? 'Cause we're seeing more and more people come to work with Measurelab that found us on Claude or found us on OpenAI because of some content we've written. If the world is going that way, which it almost certainly is, why would we not [01:02:00] write authoritative content, put it on our site, and then expose it in a nice digestible way, so when someone's asking Claude, "How the hell...
[01:02:07] Matthew: what's Salty News mean?" Then we've got our Salty News article that can highlight and surface that to them nice and easy, and we're more likely to get... And I appreciate that this is a very flippant statement, and there's some businesses whose whole business model is absolutely based on eyes on their site to surface advertising or something else.
[01:02:28] Matthew: But for a lot of others, there might just be a bit of vanity tied up in website visits and individuals visiting the site. But ultimately, for us, it doesn't necessarily matter as long as the message gets to the end point and people realize how great we are.
[01:02:44] Dara: Exactly. That's all that matters, yeah.
[01:02:46] Matthew: That's all that matters.
[01:02:47] Dara: So on that, this is a relevant tangent, but it probably does still fall into the tangent category. Did you see the generative AI search results that they're [01:03:00] proposing, that they're moving towards? So it's gonna be a per-query, on-the-fly custom...
[01:03:06] Dara: Basically, effectively a custom app that's presented back. So I think one of the examples they gave was if you're doing mortgage research, it'll build a calculator on the fly. I think in their words, it's the biggest change in search in 25 years. So the data- Yeah,
[01:03:22] Matthew: It sounded like that's just what search was gonna be now, from what I understood.
[01:03:27] Matthew: I think
[01:03:27] Dara: that's what this... Yeah, I think it's gonna start rolling out in the US in the summer. So that's gonna be huge. 'Cause already everyone's talking about, "Oh, there's less traffic going to the site because information's being surfaced through LLMs." But this is a monumental change, that they're gonna basically get rid of standard SERPs and just have these custom per-query results.
[01:03:52] Matthew: Do you think that's because... are the chatbots doing to Google what Google did to [01:04:00] websites, in that they're just stealing Google's search traffic now? Are they, do you think, starting to really feel a bit of a hit?
[01:04:06] Dara: It's gotta be, that's gotta be part of it, right?
[01:04:08] Dara: But also just them seeing the direction things are moving in, because if you take them at face value, that it's about, what's their statement? Make the world's information accessible to everyone or whatever. So this is an evolution of that.
[01:04:24] Dara: It's like providing the content in the right way, but it's a huge change for them.
[01:04:30] Matthew: Yeah, but it's essentially, and I might be wrong here, but it's essentially Perplexity, isn't it? I guess so, yeah. Because Perplexity... I was watching something the other day, some YouTuber I watch, a metalworker, and he was using Perplexity to search up these complex terms, or taking a picture of his book with all his formulas to figure out blah, blah, blah, and Perplexity was doing research and it was building little apps on the fly for him to then use as a calculator, and then he [01:05:00] could share those apps with people and...
[01:05:02] Matthew: So it sounds pretty close to where they've also been heading, but obviously Perplexity are much smaller and quieter and no one's noticed.
[01:05:09] Dara: And will be dead now that Google are- Exactly. Yeah ... are gonna do this. But yeah, no, it is. This is where it's heading, and if Google are gonna change search in this way, then it's gonna ripple through the whole internet. Because how many times have you gone, even with that mortgage example, on some old website that's really clunky, and they've got this crappy little calculator and it doesn't match your situation?
[01:05:37] Dara: There's one thing missing from it, and then you go find another calculator. And now you just put in your specific requirements and it builds you a custom calculator. And that's one example, but the same would be, you're trying to understand astrophysics and it will build you an interactive...
[01:05:54] Dara: It's like the interactive visuals that Claude does now, where instead of just explaining a [01:06:00] thing to you, it'll build you an interactive... It's that, but on a bigger scale. So websites, or whatever, whether they'll still be websites as we know them today... When you're delivering content to your target audience, if you're not making it personal to them... People have been talking about personalization for years, but it's gonna be an absolute necessity.
[01:06:24] Dara: And if you're delivering generic content to people, they're not gonna consume it in any way, shape, or form because they're not gonna have to.
[01:06:32] Matthew: No, and yeah, you've just been building dynamic websites for yourself that just collect and collate various pieces. We've built a few things at Measurelab that are starting to work towards this.
[01:06:47] Matthew: We've got a shared artifacts thing where anyone can just create a little app or whatever and push it up to our site, and then it's available for everyone to use. And that can have [01:07:00] functionality. It can just be data, it can be whatever, and it's instant and stylized and nice and easy and pretty frictionless.
[01:07:09] Matthew: So you can imagine... and the way we are starting to pitch work out to clients is much more tailored and nice and less boilerplate, because we're able to spin up these custom pieces that are so much more personalized. Even though it's AI-assisted, the output is more thoughtful and personalized because you're able to put that effort in, and not all the other effort that you would normally need to get to that output.
[01:07:36] Dara: Yeah. Yeah. No, I completely agree, and I get why Google are going that way. It's a huge bet for them, messing with search, but I think it's the way they have to go, because landing on a page with 10 results and trying to look at a little snippet of text and figure out which one to click through, and then wading through some website that's been badly built, that's gonna be a thing of the past and-
[01:07:57] Matthew: Yeah.
[01:07:58] Matthew: Plus the sponsored [01:08:00] links in Google have just got progressively worse over however long. It'd be interesting to see what the hell they do. What does that new search box look like in terms of how they advertise in it? 'Cause they're going to have to do that, 'cause that's where they...
[01:08:13] Matthew: that's where most of Google's revenue comes from.
[01:08:15] Dara: Yeah, still the bulk of the revenue. And how close will the other frontier models get to replacing the need? I don't know about you, but I'm using Google less and less now because I'm working in Claude, so I'm asking Claude a lot of what I would have asked Google in the past.
[01:08:31] Matthew: Yeah, I think I tend to go to Google when I get stuck, when I know that something's so new and I'm butting my head up against something, and I'm like, "I'll just go and find a piece of documentation as a starting point that I can pass back into Claude." That's pretty much it really, yeah.
[01:08:50] Matthew: Yeah. It's definitely reduced a hundredfold, probably.
[01:08:55] Dara: Hundredfold. I think so too. So maybe that's the [01:09:00] fear Google have, maybe they're thinking, "We have to now dramatically change search," because otherwise... maybe they do see the writing on the wall and they think people aren't...
[01:09:09] Dara: You'd love to know the numbers, wouldn't you? Are searches dropping overall? Is usage of Google dropping? You'd have to think it is. Even if it's small at the moment, you'd have to think it's starting to decline.
[01:09:20] Matthew: Yeah, 100%. And I think all of this stuff is, yeah, it's a play to bring people back in.
[01:09:26] Matthew: Some of the cool stuff they mentioned was a universal shopping cart, where you could set up price alerts across all these different things, but then you could just have one shopping cart that gets information from all these other sites, and you just have one Google Shopping cart, and it can purchase from all these different places for you.
[01:09:46] Matthew: Again, I think... the last thing actually, maybe we'll just jump into that, is Gemini Spark, which is their sort of 24/7 agentic assistant with Gmail integration, always on. You can [01:10:00] imagine that being able to go out and buy things for you and maintain all this sort of stuff.
[01:10:04] Matthew: Omni might be part of it. Yeah, it feels like their big play. How do we save search, or how do we evolve it to such a point that we're not so reliant on selling those blue links at the top of search every time? 'Cause they're smart people. They must know if that's-
[01:10:22] Dara: You'd think, yeah.
[01:10:23] Dara: Yeah.
[01:10:24] Matthew: That's friction now, isn't it? Previously, when it was the only way to get information, you'd tolerate that there's these advertisements at the top, and you just go and get the information and move on, and maybe you click on one of the advertisements. When there's much less friction to go and just ask Claude, who isn't gonna pass you advertising, the same question, then you're gonna do that.
[01:10:46] Matthew: People are gonna do that.
[01:10:48] Dara: Yeah, and if Claude can manage to stay profitable without having to rely on advertising, they could be in a really prime position, because OpenAI have had to do it, and Google are built on advertising. [01:11:00] So yeah, if you were a gambling person, maybe you'd think Anthropic are in a good position in that respect.
[01:11:07] Dara: But again, yeah, who knows? But it's gonna be really interesting to see how those changes around search affect Google. Just bringing it back then to Next and I/O and this focus on agents being the future, but with this subplot, even if, again, we're slightly biased, maybe thinking about it from that governance and semantics point of view.
[01:11:29] Dara: I want to get your hot takes on a couple of things. I'm gonna put you on the spot here. You've not been prepared for this. Let's see if this makes the final cut. I was thinking about agents. So with Google really pushing this agent approach, and this might have come up before actually, but if the models get good enough, are agents just a flash in the pan?
[01:11:50] Dara: Will agents be needed if the models themselves get good enough? Is there gonna be a point where the models themselves replace the kind of need for specific agents doing specific [01:12:00] things?
[01:12:02] Matthew: I think it depends, because you could just increase the power of the... say it's like context a little bit.
[01:12:09] Matthew: You can push up the context to a million, and what we found with the study we did with SEAM is you can ask the same question, and they'll go and find the answer, but on the one side, where it's just going to find the answer with raw MCPs, it eats a hell of a lot more context and takes a lot longer, and those are the two main pieces, to get to the right answer.
[01:12:32] Matthew: I still think that would exist regardless of how powerful the model was, if you give it the right information. Yep. And maybe that's agents, or maybe they crack it being able to learn on the fly. So if you've got a model that you can teach things to and teach processes to without having to define the agent, then maybe at that point agents become redundant.
[01:12:53] Matthew: Otherwise, I think pre-baking instructions of how to access and use tools or giving it the [01:13:00] option to grab that information is still gonna give you a quicker route to the answer.
[01:13:03] Dara: Yeah. Yeah. Or it's the models getting better and then having a really rock solid context layer, like using something like SEAM.
[01:13:14] Matthew: Yeah, that's what I mean. I'm trying to think, is there a point at which that doesn't matter? I think it relies on technology that doesn't exist yet. If they can get to a point where they can learn what your model is and take that in, and that be part of their parameters and their dataset moving forward, which they can't do right now.
[01:13:37] Matthew: It has to be based on a training set. Then you can imagine just a passive learning entity that you don't necessarily have to keep teaching things. But until that point, having semantics and having context and instructions is crucial to do it better. And maybe a better model comes out that [01:14:00] will get to the answer anyway, like 100% of the time, but it'll cost more.
[01:14:05] Matthew: It'll be worse for the environment.
[01:14:06] Dara: Yeah, true. Yeah. Another thought I had, something we didn't actually cover, but they announced a Looker MCP. I'm saying this is from my research partner. It probably happened two years ago, not at Next or I/O, but apparently, according to my buddy, they announced a Looker MCP server.
[01:14:23] Dara: And it got me thinking about something else that we have talked about, maybe a bit on the pod, but certainly outside of the pod. We've talked about whether traditional dashboarding and BI tools are gonna become a thing of the past, and I took this bit of news (which I think you're looking up to verify if it actually is news or not) as almost a sign from Google that they think that is the case, that Looker as a BI tool- Yeah
[01:14:48] Dara: they're almost starting to navigate around it and expose the semantics to agents, which would then... Look, it's like with everything, it's not black and white. It's not saying [01:15:00] dashboards are dead or BI tools are dead. There's still a need for static dashboards. But actually, the days of that being the output are changing, and if you can connect an agent to a governed data source, then for a lot of your typical business questions, you're not gonna need to have them surfaced in a dashboard or have to go in and use a BI tool to get that answer.
[01:15:24] Dara: You're just gonna be able to get it through your LLM or in some other way.
[01:15:29] Matthew: Yeah, it is one of the big theories we've seen: tying the semantics and the models to individual properties like Looker, like warehouses, is the wrong way to go about it.
[01:15:48] Matthew: You're gonna pay a lot of money for Looker ML, and you're gonna spend a lot of time, a lot of resource, and a lot of effort building out the models inside it. And if that is locked in there just for static dashboard BI [01:16:00] reporting when all of this other stuff is going on around you, then yeah, that seems a crazy thing to spend your money on.
[01:16:07] Matthew: And you'll probably start thinking, "How can I solve this problem and this problem? Oh, there's a tool over here that lets me do BI and expose the models and whatever," or, "Maybe I'll just cut out the middleman and have Scene."
[01:16:18] Dara: Yeah. Yeah. Which would obviously be our completely unbiased recommendation.
[01:16:23] Matthew: Yeah. We got really biased in this podcast.
[01:16:26] Dara: Yeah. I've lost my BBC credentials.
[01:16:30] Matthew: Yeah. No, but ultimately that was one of the reasons we did it: if you tie it to any individual thing, you're governing a portion or a slice of functionality or data entry.
[01:16:42] Matthew: And separating that out and putting it between any LLM or dashboard and the data means that everything is governed regardless and is governed on purpose and always. Yeah.
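The point about putting a single governed layer between any LLM or dashboard and the data can be illustrated with a toy sketch. Everything here is an assumption made for the example: the `orders` table, the metric and dimension names, and the `compile_query` and `run` helpers are hypothetical, and this is not how Scene, Looker or any particular product works. It uses Python's built-in `sqlite3` so it runs as-is.

```python
import sqlite3

# Hypothetical sketch: the table, metrics and dimensions are made up.
# Metric definitions live in ONE place, outside any BI tool or agent.
METRICS = {
    "revenue": "SUM(amount)",
    "orders": "COUNT(*)",
    "average_order_value": "SUM(amount) * 1.0 / COUNT(*)",
}
DIMENSIONS = {"country", "channel"}


def compile_query(metric, by=None):
    """Turn a governed metric name (plus optional dimension) into SQL."""
    if metric not in METRICS:
        raise ValueError(f"ungoverned metric: {metric}")
    if by is not None and by not in DIMENSIONS:
        raise ValueError(f"ungoverned dimension: {by}")
    expr = METRICS[metric]
    if by is None:
        return f"SELECT {expr} AS {metric} FROM orders"
    return f"SELECT {by}, {expr} AS {metric} FROM orders GROUP BY {by} ORDER BY {by}"


def run(conn, metric, by=None):
    """Single entry point used by every consumer, dashboard or agent."""
    return conn.execute(compile_query(metric, by)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, channel TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("UK", "organic", 100.0), ("UK", "paid", 50.0), ("US", "organic", 30.0)],
)
conn.commit()

# A dashboard tile and an agent answering a question go through the same
# definitions, so they cannot disagree about what "revenue" means.
dashboard_tile = run(conn, "revenue", by="country")
agent_answer = run(conn, "revenue", by="country")
```

The design choice being illustrated is only that the definitions sit in front of the data rather than inside one consuming tool; a request for a metric that has not been defined is rejected instead of being improvised by whichever tool happens to be asking.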
[01:16:54] Dara: I agree, 'cause otherwise there'll just be multiple different [01:17:00] domain-specific or product-specific semantic layers all competing with each other within your overall estate or whatever, and actually it is the wrong way to be thinking about it. It's gotta be a meta layer that sits between AI and all the different sources. That genuinely wasn't an intentional plug for Scene. That was a genuine thought, having seen Google, yeah-
[01:17:29] Matthew: Why is it written on our show notes, "Plug Scene"?
[01:17:31] Matthew: Is that what's that? Shh.
[01:17:33] Dara: You're not meant to read that bit out loud.
[01:17:34] Matthew: Oh, sorry.
[01:17:35] Dara: I think we're there, right? We'll have missed a whole bunch of things. I think I said last time it was like 250 updates or something, but a lot of them are, "We've changed the name of this," or, "This is out in general availability now."
[01:17:45] Dara: But yeah, they were the kind of, certainly the headline updates at least that were most relevant to us and what we do and what we care about.
[01:17:52] Matthew: Yeah. Yeah, definitely.
[01:17:54] Dara: Okay, let's leave it there. Until next time.
[01:17:56] Matthew: See you later.
[01:17:58] Dara: That's it for this week's episode of "The [01:18:00] Measure Pod." We hope you enjoyed it and picked up something useful along the way.
[01:18:04] Dara: If you haven't already, make sure to subscribe on whatever platform you're listening on so you don't miss future episodes. And if you're enjoying the show, we'd really appreciate it if you left us a quick review. It really helps more people discover the pod and keeps us motivated to bring back more. So thanks for listening, and we'll catch you next time.