#137 Which LLMs stand the test of time?
Dara & Matthew explore the latest AI model releases, developer tools, Google Shopping's agentic features, and Google's AI strategy and infrastructure edge, then talk AI, agents, and machine consciousness with Daniel Hulme.
In this episode of The Measure Pod, we welcome Daniel Hulme, an expert in artificial intelligence and the founder of Satalia, a company recently acquired by WPP. Daniel shares insights from his extensive background in AI, which includes studying at UCL, where he completed both his undergraduate and master’s degrees, as well as a PhD focused on modeling bumblebee brains. He discusses his experience in academia, including running a master’s program in applied AI and his current role as an entrepreneur in residence at UCL, where he assists in spinning out deep tech companies. Join us as we delve into Daniel’s insights on AI, consciousness, and much more.
Share your thoughts and ideas on our Feedback Form.
Follow Measurelab on LinkedIn
“The culture in Satalia is, let’s not do things that we don’t think are intellectually credible.” – Daniel
“I often say the more adaptive you are, the more innovative you are, the more intelligent you are.” – Daniel
Show full (AI-generated) transcript
Lizzie: Hello and welcome to The Measure Pod by Measurelab, the podcast dedicated to the ever-changing world of data and analytics with your hosts, Dara Fitzgerald and Matthew Hewson. Between them, they’ve spent more years than they’d like to admit wrestling with dashboards, data quality, and the occasional Google curveball. So join us as we share stories about how analytics really works today and where it might be headed tomorrow. Let’s get into it.
Dara: Hello, and welcome back to The Measure Pod. I’m Dara, joined as always by Matthew. How are you doing today, Matthew?
Matthew: I’m all right, thank you. Yeah, not too bad. Yourself?
Dara: How’s your existential dread on a scale of… I suppose we’ll go with the standard zero to 10, one to 10.
Matthew: Pretty low at the minute. I think we’ve both picked up a book off the back of this conversation that we’re going to have later today, which is The Hidden Spring, about figuring out what consciousness is. I’m halfway through that, so I’m more confused. I was following it for about four or five chapters, and then it got to the middle where he started to bring physics into things, and I’m a bit lost. But it’s really replaced all the other stuff in my brain, so that’s good.
Dara: Serving some purpose. I didn’t sign up for the physics bit. I’ll probably get this wrong now, but he was a neuroscientist, right? Then he retrained as a psychologist or a psychiatrist or something. I didn’t realize he was going to bring physics into it as well. Maybe that’s just one step too many for me.
Matthew: Yeah, he was. I think he did the first one, found it all too grounded in the physiology, and then needed to bring in the others. Yeah. I don’t remember his name, and I was just talking about him.
Dara: Doesn’t matter. He has a name.
Matthew: He has a name. He’s got enough money and books and praise. He doesn’t need to give us that.
Dara: The Hidden Spring, yeah. No, I’m looking forward to that one, but I’m a bit behind. I haven’t even read the terrifying one, and I’m looking at it because it’s over there: If Anyone Builds It, Everyone Dies. So I haven’t got to that one just yet. I’m midway through another book. So I’ve got these two cheery ones ahead of me. It’s turning into a little book club.
Matthew: It is, yeah. The Measure Pod Book Club. Whoa, whoa, whoa. Hold your horses, Dara and Matthew. Since we recorded this, which was, what, a week and a half ago? Three or four massive bits of news came out, so it felt very odd to just not talk about them at all in the introduction to this podcast. So this is a little addendum to the news you’re going to hear in a minute, about the things that have happened since we recorded this introduction. Those things being, number one, Google finally releasing Gemini, which looks, well, is very cool.
Dara: They released Gemini a while ago.
Matthew: Gemini 3.0, sorry. That’s a good point. They released a Gemini. It absolutely blew everything out of the water in terms of benchmarks; it was way above everything else. They released it instantly, too, which was very nice. Top marks to Google for doing that. So we have been playing with it, we have had access to it, and it is really good. It feels snappy and different and very smart. You’ve played with it a bit as well, right, Dara?
Dara: I have, I have. I’ve played with it in Antigravity. I’m giving spoilers away now, I’ve played with Antigravity, and people are going, well, what’s he talking about? I haven’t really played with it outside of that much, apart from a little bit of messing around with Nano Banana Pro. Oh, I’m doing it again. I’ve only used all the things you haven’t mentioned yet.
Matthew: Yeah. Okay. Well, it is available pretty much everywhere. I think it’s now available to Pro users; it was available in enterprise pretty much straight away. It was available in Antigravity, which we’ll come on to in a second, and I’ve now seen they’ve added it to the Gemini CLI, so it’s pretty much everywhere now. Go and have a play. It seems really powerful and cool. And Antigravity, which was also announced on the same day, is Google’s IDE. It’s essentially a clone of Visual Studio Code that they’ve built out; think Cursor and Windsurf and things like that. It’s got a real focus on these agentic workflows. It has an inbox where you can set various tasks off to do things, and then they’ll reply to you in your inbox, and there are all sorts of interesting and cool things in there. It looks cool. It is cool when it works. At the minute I’ve found, and I think you found the same thing, Dara, it tends to fall over quite a bit, saying it’s encountered an error, and there are model limits that I keep hitting up against at the moment.
Dara: Yeah, I had the same. And actually I haven’t used it in a couple of days, so I don’t know if that’s still the case. I’ve abandoned it for now. Yeah. Which is not great, is it?
Matthew: I think I can see it. I can see the vision, and I’m pretty convinced that in three or four months’ time I can imagine myself living there. But right now it just wasn’t competing with Claude Code in VS Code for me, so I went back. The other thing is there are some slight differences. With Claude Code, you can plan, and you can make these plans and then tweak them and get to a certain point. Certainly with the way I set Antigravity up, it makes a plan and it just cracks on. It doesn’t stop and say, does this seem like a good plan? So it’s a slightly different way of working, a different archetype, that takes some getting used to, but it’s really interesting.
Dara: Yeah. And I set mine up similarly because I think it’s the recommended settings. I think it recommends that you do it that way. And I’ve found the same. And sometimes that’s handy because you don’t want to be having to approve everything if it’s making sensible suggestions. But yeah, occasionally you are like, whoa, you’re getting a bit carried away. I didn’t actually tell you to do that. I said, make me a plan. And now you’re going off and building things. So a little bit of tinkering with that.
Matthew: I’ve hacked the Russian nuclear authority. Wait a minute. Yeah. And the other key feature, I suppose, is that it has inbuilt browser capabilities. It has an extension that goes into Chrome, so it can build web UIs, go off to the web UI, look at it, go through journeys and understand what’s happening, pass that back to itself, and loop through and fix its own issues. I guess the Chrome DevTools MCP that came out recently was probably a precursor to this; it seems like it’s using a similar thing to drive Chrome. Do you remember the Chrome DevTools MCP? Is that what it’s called? Yeah, I think so. I think you were miles away. I thought it was a blank look, but I think it was just your face.
Dara: Yeah, it’s just my regular face. But yeah, no, I know what you mean, and I think you’re probably right, because Google often do that, don’t they? They’ll have a version of something over here and then use that to bake into something else. So I think you’re probably right. I don’t know if it’s just what I’m using it for, but I haven’t got a huge amount of use out of that, though I can certainly see why it would be useful. The only thing that’s happened to me, and I think this says more about me than about Antigravity or the AI models, is it giving me screenshots and telling me the screenshots prove it’s done something when it hasn’t done it. And I had to say, that screenshot doesn’t actually show what you’re telling me it shows. And it was like, oh, right, okay, I’ll go back. But I’m starting to think that might be user error, because it’s happening to me far too often.
Matthew: Yeah, I’ve not done much in terms of that. A lot of what I’m doing is data engineering-y stuff, Dataform and bits and things like that, so I don’t do screenshots. Next time I do any sort of web development, I’ll definitely give it a whirl and see what it produces. Next up, and this is, again, off the back of this Gemini 3.0 release, is Nano Banana Pro. They seem to have leant into Nano Banana now. It was originally Vertex AI Image, and now it’s Nano Banana Pro, so they’ve just adopted the name.
Dara: Yeah, everyone was calling it that, and I think that’s what everyone was searching for. Anytime I tried to use it, I just Googled Nano Banana. So they probably looked at their own search volume and thought, yeah, we have to just accept this now, it’s Nano Banana.
Matthew: Yeah. So it’s a souped-up version of Nano Banana, which was already pretty much, I would say, the best image model out there. And it’s just dialed everything up to 11. The text is pretty much perfect on anything you put in. People have been using examples where they put in long text sheets or papers or whatever, and it’s been able to produce infographics with all perfectly legible writing. It’s able to take in multiple different objects that you upload now and combine them together, with real memory of what the original images looked like. It just seems pretty amazing. And I saw some examples with all the tech founders. Somebody had done photographs of a load of tech founders as if they were on holiday, or on a piss-up around some Asda car park somewhere. It was Jeff Bezos and Mark Zuckerberg and people, and it looked really real. It didn’t look like the over-saturated, over… what’s the word? Contrasty image that you normally get from an AI. It looked like someone had taken it on a film camera. So was it not a real photo? No, it wasn’t a real photo, no. They weren’t pissed up in an Asda car park.
Dara: Oh, what a disappointment.
Matthew: Despite the rumors.
Dara: I was choosing to believe what I wanted to in that story.
Matthew: Well, I could go get you the photos and you can just pretend they’re real if that’s what you want to do.
Dara: I’m easily convinced, to be honest. But you’re right, the improvements to the image generation itself are obviously a big deal, but it’s the type of use cases that I thought were quite cool, like being able to use it for creating diagrams. I saw a couple of other examples on LinkedIn, and I know you’ve done this too, where you put in a load of thoughts, whether they’re in a document or various notes or whatever, and get it to create a whiteboard image. It looks like something you’ve actually drawn up on a whiteboard, based on a whole bunch of thoughts that you feed into it. It crunches them all down and distills them into this nice visual view of your thinking, which could be great. For example, I would find that quite useful because I don’t particularly like using a real whiteboard, maybe partly because I’m left-handed and I tend to smudge everything; I just wipe it as I’m writing it. So it’s a helpful way to process your own thoughts, even if you’ve written a whole load of handwritten notes, because it’s decent at reading handwriting. I mean, I haven’t tested it on my handwriting, but… I think that’s the ultimate test.
Matthew: I wouldn’t be surprised if we could get hired by the frontier models to feed in really, really bad examples of handwriting, to see if they could figure it out. Yeah.
Dara: Yeah.
Matthew: Because I’ve seen someone take a picture of a page where they’ve written out a mathematical formula at the top, and it solved the formula and returned an image with the workings in the same handwriting as the thing written at the top. It’s pretty crazy. And I’ve created diagrams, conceptual diagrams of how I want things to be structured together; someone else on the team made an architecture diagram of various different components of things. I’ve tried that stuff with other image models before, just on the off chance, and it’s just crap. They just weren’t for that.
Dara: Well, they are now, which is pretty cool. So to summarize and wrap up and end, this is the best model available and there is nothing else better.
Matthew: Yeah. Well, enter Anthropic, who couldn’t let sleeping dogs lie and came out with Claude Opus 4.5, which they claim to be the best model for agentic coding and computer use and all of these other benchmarks. I don’t know if they’re starting to selectively use different benchmarks and measures, because Google looked so far ahead in their release documentation, and then Anthropic released a load of stuff where Google looks much tighter, closer to ChatGPT 5.1 and Claude 4.5 with 3.0, and then Anthropic’s up here. It’s so hard to tell. They’re beginning to game the system and pick and choose the right metrics and benchmarks. But yes, they’ve come out and released their model, which they now claim is the ultimate in terms of agentic coding. Anthropic are definitely going down the business coding and computer use route, versus some of the wider use cases Google’s gone for. And I haven’t massively tried it. I haven’t really touched it that much, to be honest.
Dara: No, I haven’t a huge amount. One thing I was keen on: they’ve improved some of the products that use the models as well, and some of those improvements have been quite handy. One in particular is that you don’t hit the chat session limit, I don’t know if that’s what you call it, where it annoyingly tells you there are too many messages in this chat and you need to create a new one. At least now, with the memory, that’s not as painful as it used to be, when you’d have to basically start from scratch. Now it’s condensing that context down.
Matthew: So you still have the quota limits, but you don’t have that particular limit. It’s like Claude Code, where it keeps compacting. The longer you go, it compacts things back down and carries on again. Yeah.
Dara: Yeah.
Matthew: So you can keep going in that conversation.
Dara: In theory, indefinitely, though obviously there is a limit; you’ll hit the usage limits eventually as well. But that’s quite handy, because I hit that quite a bit. If you’re using it as a back and forth, if you’re trying to figure something out, say you’ve got a new car insurance quote and you’re asking it to look at it and provide quote comparisons or whatever, and suddenly your conversation hits that limit, it’s been quite annoying. So that’s an improvement, although it’s not to do with the model itself. I haven’t really tried out the model. And again, I think it depends on your use case. Like you said, the companies themselves are selectively looking at where they might be stronger than others, but as an end user, I’m not sure yet what would make you use Opus versus Gemini 3.0.
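As a rough illustration of the compaction idea Dara and Matthew describe, here is a minimal Python sketch: once the conversation history grows past a token budget, the oldest turns are folded into a short summary so the session can continue. The `summarize` function is a stand-in for whatever LLM call you would actually use, and the token estimate is a crude heuristic; none of this is Claude’s actual implementation.

```python
def summarize(text: str) -> str:
    """Placeholder for an LLM call that condenses old conversation turns."""
    raise NotImplementedError("wire this up to a real model")

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return len(text) // 4

def compact(turns: list[str], budget: int = 8000) -> list[str]:
    """Fold the oldest turns into a summary until the history fits the budget."""
    while sum(estimate_tokens(t) for t in turns) > budget and len(turns) > 2:
        # Merge the two oldest turns into one condensed summary turn.
        merged = summarize(turns[0] + "\n" + turns[1])
        turns = [f"[summary of earlier conversation] {merged}"] + turns[2:]
    return turns
```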
Matthew: No, and as a wrap-up, I do think overall Google ends 2025 on top. Because one thing that really struck me: when Anthropic announced Opus 4.5, Google also announced that Claude Opus 4.5 was now available in Vertex AI. And it really made me think about how much they’ve got in terms of infrastructure, being able to serve all these other models as well. They can profit off Claude. I think Google actually own part of Anthropic, if I’m not mistaken. So it did make me think, you could probably argue 3.0 was still the best model for those two days, and the infrastructure and everything else they’re building around it is just so big and mighty.
Dara: Which they’re providing to all of the others. Well, I say all of the others, maybe not the Chinese companies, but all the other US AI companies are being built at least partly on Google infrastructure. So like you said, they’re going to make money no matter who is leading the race.
Matthew: Yeah, exactly. So yeah, that was the news that happened in the couple of days since we recorded the news. So back to you, Dara and Matthew.
Dara: We talked about NotebookLM last time, and there have actually been more updates even since then, so there seems to be a flurry of things happening with NotebookLM. I also inadvertently answered one of my, well, it wasn’t a question, but I kind of guessed at the number of sources and said 200 in the last episode, which was wrong. I was close. It’s 300 in the enterprise version and 50 in the regular version. So just to clarify, in what way were you close? I was hoping you wouldn’t pick up on that.
Matthew: On a scale of one to a million, you were close, I suppose.
Dara: 200 is a number. 50 and 300 are also numbers. So in that respect, I was pretty close. No, you did well there, yeah. I could have said yellow, which would have been nowhere near.
Matthew: I mean, I didn’t know when you said 200 whether it was, like, 200 megabytes.
Dara: No, so individual sources. So 200 is quite a lot.
Matthew: No, it’s not that, is it? It’s 300. This is the problem with salty news. It starts to bleed into the real news; people start taking it as fact.
Dara: It does. There are going to be people out there basing all of their work on 200 sources, and they’re either going to have less than that or more than that. So I’ve, well, desalted, unsalted, I’ve unsalted that particular piece of salt with the update. So, deep research, bit of a theme, isn’t it? They’ve added in deep research. I’m going to talk about it like I’ve used it, which I haven’t, so I’ll get that disclaimer out of the way, because again, I think it’s US only initially, or it’s just a gradual rollout, and I haven’t seen the feature yet, so I haven’t been able to get my hands on it. But with deep research, you’re going to be able to use it to go off and actually find the sources for you. It looks like there are two different modes or options: you can do fast research and you can do deep research. It’s like a research agent that will go off based on your brief and actually find you sources, which you can then import into your notebook as sources, if you want. And it gets a bit meta: you can add the sources, and you can add the compiled report that the research agent builds for you, and then you can do further prompting and research yourself within your notebook based on that. So quite cool, I think, at least in terms of the potential, because today you’ve had to go and find those sources yourself and manually add them in, whereas this will actually go off and do the research for you and then suggest the sources. One thought I have about it, though: it does say it’ll be source-grounded, but it feels a little bit like this could slightly erode some of the value of NotebookLM, potentially, if it’s going off and finding sources. Those sources could be anything. They could just be some conspiracy theorist’s blog, and it’ll go and pull that in. So if you don’t go through and check all the sources, you are potentially going to be moving away from reality a little bit. It might still not hallucinate, but you don’t have the same control over the sources that you’re including.
Matthew: Yeah. I mean, it could be even worse than that. This probably pops into my head because I just watched a video about it this morning on a YouTube channel called Computerphile. They do lots of deep dives into maths and computers and things like that. And they were reacting to a statistic that up to 50% of new content on the internet is LLM-driven and -written.
Dara: 50%?
Matthew: Yeah, that’s the number they were working with. They were pointing out that things like this, like Notebook going out and doing research and pulling in information, are probably pulling in a good chunk of LLM-written things, and where did that LLM get it from, and how deep does that go? It gets to the point where everything is just kind of eroding away in terms of trusted information that you can actually reliably source. There’s another channel called… I can never say this. Kurzgesagt, or something like that. It’s this educational channel; I think I might have mentioned it before. They do research for like 90 days to six months, I can’t remember now, but a long time, with loads of researchers. They go really deep. When LLMs came along, they got excited that they could cut out swathes of that research and just get it to create stuff for them. They instantly got rid of like 25% of the output for being not reliable, not sourced. But then as they got further into it, they realized a lot of the stuff that was sourced, when they dug a little deeper, was actually coming from an LLM-created thing, and that LLM-created thing didn’t have a source, so they couldn’t use it. And they even saw videos start to pop up on the subject with those unsubstantiated claims and facts that had come from an LLM in people’s factual videos. So shit, that’s not good.
Dara: Yeah. Layers of shit.
Matthew: Yeah, I don’t know. Somebody did raise an interesting point on that, so I’m going off on a tangent here, but they said essentially exactly the same thing happened with email. In what way? You used to have your email and you’d get email from a person or from some newsletter you signed up to. It’s got to the point now where 90% of the email you get is spam from unsubstantiated sources, AIs, all the rest of it, and sophisticated systems have spun up to filter all of that noise out and just leave you with what you actually care about. They started to poke at the idea that it might have to get to the point where spam systems sit on top of browsers and filter out all of this nonsense being generated by half the internet. It may even be that frontier models have to figure out how to do that to keep training on actual data and not just on themselves over and over again.
Dara: Yeah, no, it’s true though. It is a worry. And if you’re outsourcing more of your research to AI, then it’s just a… what is it? A vicious cycle. Yeah. Snake eating its own tail. Exactly. So we’re going to announce the news and say these features are available, but then in the following breath say, don’t just trust them, because a lot of it is going to be rubbish. Use them with caution. Exactly, use them with caution. And I think with all of this, the theme running through it is don’t just trust it at face value. You can’t move away from checking things; you can’t move away from looking to validate them. It seems like common sense, and probably isn’t in a lot of cases, but don’t just favor convenience entirely, outsource all of this, and not check what’s being produced. That’s the bottom line really, isn’t it?
Matthew: Yeah, and there’s a bit of a segue there into another news item, if I may be so bold, because I think that’s what Google’s trying to get at with their new Code Wiki, which they’ve just released. One of the problems you’ll find whenever you code with LLMs is that most of them have training data that cuts off in 2024 or early 2024, and that means a lot of the libraries they reference are out of date. They often don’t even know about new LLM models. They’ll regularly run into issues where they have to go and research through various places to try and get the latest documentation and update their thinking in the current context. Google has created, to quote Google, “a new perspective on development for the agentic era. Gemini-generated documentation, always up to date.” I think what they’re essentially doing is curating all of the open source repos out there and continuously getting Gemini to generate new documentation based on the latest updates, which can feed into agents so they always have the newest information and library updates, even if they weren’t in their training data.
Dara: So you would point to this, would you? Is that the idea, that if you’re building something, you would point to this as the kind of source of truth?
Matthew: I would assume so, yeah. There are already some open source versions of this. Context7 is one that springs to mind; it’s an MCP that you can point your LLM at, and it will go and check the latest documentation. At the minute there’s just a “notify me when it’s available”, so it’s a coming soon thing. Oh no, actually, sorry, that’s for trying it with your own repo. So you can point it at your own repository and get it to continuously update and write documentation for that as well. That’s my guess at what this is for, and it’s why I linked it to the previous item, because it seems like a way of trying not to create crap code from bad information. It’s trying to give itself new information from repositories, and from your own repositories. So there is hope. Yeah, there’s always hope. We’re going positive now, right? We’ve talked about this. We’re trying not to go into the doom and gloom every time we talk about AI, and trying to lift up at the end.
Dara: We’re a roller coaster. This is what we do. This is our style. We go from the terrifying back to the positive. It’s the contrast that makes things interesting, isn’t it?
Matthew: Yeah. Like the in other news segment of local news. Yeah, exactly. Yeah. Some horrendous crime. And then in other news, a puppy’s been born.
Dara: Yeah. The little feel-good stories. Yeah. Well, on the less terrifying front, I guess, another update, or set of updates, is from Google, in Google Shopping. Again, this is US only. Bit of a recurring theme: so far it’s US only. They’re introducing more AI-based improvements into Google Shopping. There’s a conversational element to it: in Google’s AI Mode, it will link up with Google Shopping. So if you ask something like, I think the example they give is a good face cream for the winter season or something, it’ll bring up a comparison or product images, and you’ll be able to link through to Google Shopping from there.
And then they also have the, I’ve forgotten what they call it now, the agentic e-commerce or whatever their name is for it. I’m trying to find it now because I had it open somewhere. Agentic checkout, I think, is what they’re calling it. This is where you can do things like set the price you’re willing to pay for something, and if it can find a product that matches that price, it will actually purchase it for you using Google Pay. It does say it asks for your permission; I’m not entirely sure at what point it does that, or if you can bypass that and say, if you can find it, just buy it for me. But this is something we’ve talked about a couple of times before, both with Google and with OpenAI, where they’re moving towards this kind of end-to-end experience where you’ll be able to purchase the product without ever actually going onto the website. Yeah.
Matthew: They announced this, most of the stuff in that article, at I/O 25, I think. I’m trying to remember when that was. I think that was before Next 25 even, but it looks like they’re actually pulling the trigger on it now.
Dara: Yeah, I don’t know. That was what I was wondering about with the agentic checkout bit, because I thought that wasn’t news, but it seems like it is.
Maybe they talked about releasing it and it hasn’t actually come out until now.
Matthew: Yeah, I don’t think so. Because I’ve seen in that article as well they’ve got stuff around price drops and automatic buying and things like that. So you can say, check if this goes below 45 quid, and if it does, the agent will just buy it for you with your Google Pay. I remember all of that stuff very specifically from I/O 25. So it’s just another example, I suppose, of Google being a bit more tentative, or announcing things way too early, or both.
Dara: Yeah, there’s another interesting bit to it, another interesting feature, which is that an AI bot will actually call the store for you to check that it’s got the product in stock. I think that’s going to be really interesting. They even say in the release article that merchants will be able to decline these calls, but it’s going to call up, and you’ll get some kind of message saying, this is an AI calling you on behalf of the customer, and it will ask you questions like, do you have this product in stock? What’s the price of it? Whatever. It’s going to be quite interesting to see whether people actually accept those calls or not, because that could become just a new type of spam.
Matthew: I think it’s almost like you would have to, and I wonder if Google sells the solution to this, but you’d have to have an AI on the other end that answers the call and replies with the information. Because as you can imagine, if any Tom, Dick, or Harry can be putting in calls to X number of stores to check stock, there’ll be thousands of them.
Dara: Yeah, and you don’t want to rely on humans answering each of those calls. So it makes sense: if you’ve got an AI buying agent making the calls, you should have one receiving them as well.
Matthew: And just do that jabber. Have you seen that jabber-speak that AIs do? No? They can communicate really quickly. I’ll try and find an example of it, but it almost sounds like the dial-up tone from the 90s. They can communicate far faster and then just hang up. It’s quite terrifying.
Dara: I mean, are we ever going to talk on phones again, or are we just going to have… Yeah, I’d never quite imagined a little AI bot actually physically picking up a phone and calling another one.
Matthew: Well, it’s because you haven’t got any whimsy in your life. Yeah.
Dara: I just lack imagination.
Matthew: Yeah. I assume they were all robots. Tesla’s robots picking up the phone.
Dara: Probably is. Probably is. So, to go from the whimsical back to the terrifying again: there was the OpenAI… no, that’s wrong, I’m getting confused between my AI companies. It was Anthropic that released the fact that this happened. They spotted, identified, that financial institutions and government agencies were being hacked, or attempted to be hacked, by… I feel nervous saying this, but this was in the news: they’re saying it was Chinese state-backed hackers that did this. I haven’t dug into it enough to find what proof they have of that, but that’s what’s in the news; that’s what they’re saying it was. What was different about this compared to previous hacks was that this was almost entirely AI. There was very little human involvement, which is different, I think, from previous attempts, where AI was used but there was still heavy human involvement. With this attempt, something like 70 to 80% of the actual work was done by AI, and there was very little human involvement, which was a little bit different.
Matthew: Yeah, I saw this from two different angles, because I saw Anthropic’s original study, which is mostly what you just laid out there, but then I also saw the BBC News article on it, where a lot of security experts were throwing doubt over Anthropic’s claims. Oh, really? Essentially making out that they were inflating it a little bit. To what end, I don’t know. I think there was some accusation that by sensationalizing it, they publicly show that their LLM is capable of X, Y, and Z, and it ups the profile and power of things. I don’t know which side of the fence I’d fall on, but it was interesting. I’d never seen somebody go after them like that. Anthropic has always been the golden boy of releasing all this research; I’d never thought about it as also another sales tactic to show the increased capability of LLMs. You’re just not cynical enough. No, I’m full of whimsy. You are.
Dara: You’re all about the whimsy and just lacking that cynical edge. Yeah. I must admit, I hadn’t seen that; I wasn’t aware. I mean, when you think about it, of course there’s a possibility that that’s the case, but having only read Anthropic’s kind of news around it, I was thinking, oh, this is more of an argument again, and we’ve talked about this before, for how they’re open and willing to share their own vulnerabilities. But potentially, who’s to know what’s a PR stunt? Or maybe it’s half and half: maybe it did happen and it’s an opportunity too, or maybe there’s a little bit of creative reporting within it.
Matthew: I’m sure it did happen, but it’s just, yeah, to what scale, how much of it the AI did, and where they got their information from. It could also be a lot of security people having a knee-jerk reaction, that fear-based response where people deny things when they don’t want them to be true. I don’t know. I’m not saying that is the case. Don’t come after me, security people. I’m just seeing it from a few angles. But yeah, that’s that.
Dara: So, joining us as our guest on today’s show, we have Daniel Hulme, who is the Chief AI Officer at WPP and also CEO of Satalia, his company, which was acquired by WPP. Daniel has been working at the cutting edge of the AI space, I think it’s fair to say, for the last 15-plus years. We had a very broad conversation with him about his thoughts on AI and where it’s heading, both in the industry and, I guess, more widely in society as well.
Matthew: Yeah. I mean, I think we went from the everyday to what is consciousness, and neuromorphic computing, and all sorts of stuff. So some of it’s very blue-sky, but really, really fascinating. And he’s a hell of a bright guy to chat to. So I think people who like a bit of the abstract will really enjoy it.
Dara: Enjoy the chat. Joining us on The Measure Pod today, we’re really excited to have Daniel Hulme. Daniel, firstly, thank you for agreeing to come on and talk to us. We know you’re a busy man. Welcome to The Measure Pod. It’s a pleasure. Thank you. I actually saw you speak, I’m not going to say how long ago it was because it was quite a while ago, at CRAP Talks. Strange name, but great events. I saw you speak at that in London, and at the time I was quite intrigued. You were talking about the work you were doing at Satalia, but what I was also quite intrigued by was how you were running Satalia, some of the kind of progressive approaches you were using to run the company itself.
Obviously a lot’s changed since then. So Satalia has since been acquired by WPP. So I’m sure we’re going to go into that quite a bit, but before we do, could we just get you to introduce yourself to our listeners? So you can give as much or as little background as you like, but just. A bit of a summary of your background up to what you’re doing today.
Daniel: Yeah, absolutely. So my background is actually all in AI. My undergraduate degree, 25 years ago, was in AI at UCL; there were two people on the course. I did my master’s in AI, and my PhD modeling bumblebee brains and some other aspects of computer science. Then I did several postdocs. I ran a master’s program in applied AI at UCL for several years, and I’m currently an entrepreneur in residence at UCL, so I help them spin out deep tech companies, and I’m on the advisory board of organizations like St Andrews and Sussex to help them understand how to commercialize research. I started a company from my research about 18 years ago, Satalia, which has been building AI solutions for some of the biggest companies in the world. We sold to WPP four years ago, and we like to think of Satalia as being WPP’s DeepMind. We continue to apply AI to marketing, and to many, many clients. But I now run AI across WPP, so about 110,000 people. WPP have looked after me very well, and I get to do interesting things; I just started a company a year ago, invested in by WPP, actually, to look at machine consciousness.
Daniel: I get to do more interesting things, and I also get to invest in interesting companies as well.
Dara: Just before we get into the deeper stuff around the work you’re doing, I’m curious to know, and this is a big question to start off with, how have things changed? Four years ago you had one job; now you’ve got, it sounds like, at least three jobs. How are you managing those? You’re still running Satalia, but you’re also Chief AI Officer across the group, and then you’ve got the consciousness company. How are you finding that? I mean, you’re smiling, for anyone who’s not watching the video, so you’re obviously not overwhelmed. I don’t know.
Daniel: I’ve always had a sort of portfolio career. I used to lecture in parallel to being at Satalia, and I used to do talks, and still do lots and lots of talks around the world. I find that these things all interact. If I did just one thing 24 hours a day, I think I would get really burnt out, but I’m very, very fortunate that I’ve managed to create a portfolio of things that all complement each other. And I don’t get bored. I guess, you know, when you have a hobby, it doesn’t feel like work, and I feel like I’m bringing all of these different things together; they’re like hobbies that don’t feel like work at all. So I often say to people, I borrow time from the future in the same way that governments borrow money from the future. But the reality is that I just got very fortunate to have this portfolio of activities that are all interrelated and build off each other.
Dara: And can I just go back a little bit again? I know Matthew is going to be champing at the bit to get into the AI tech questions, so just before we go down that road: I mentioned at the start that I’d seen you talk at an event, and you were talking at that time, it wasn’t the focus of your talk really, about how Satalia was doing things a little bit differently. Things like peer-reviewed and partially AI-based reviews for people working at Satalia, salaries that were transparent and set in that way, decentralized decision-making, all of these quite progressive approaches. How has that integrated since you’ve been acquired by a bigger organization? Has it changed, or has it changed your thinking as well?
Daniel: So partly, no, it hasn’t changed. I didn’t make my life easy with Satalia, because Satalia in and of itself was like a venture builder. We wanted to get innovations to market as fast and as effectively as possible, whether through consultancy, building solutions for companies, turning those solutions into assets, productizing solutions, or open sourcing solutions. Most companies just do one thing, and we know that as companies grow, they have to diversify their revenues, so they end up being a services company as well as a product company, and that’s hard for a lot of organizations. The challenge I wanted to solve from the very beginning was: how do you architect an organization that allows us to get innovations to market as fast and as effectively as possible?
If we could learn how to do that, then we would be a good innovation partner for the organizations we were building solutions for, and we would learn how to help organizations become more innovative themselves: how can they use these methodologies to make themselves more innovative? If you cut me in half, I think innovation is my word. So joining WPP hasn’t changed that. We still have very few rules in Satalia. WPP, of course, is a public company, so we have compliance and various other things, but fortunately we already had a lot of processes and structures in place to deal with that.
I think the core problem we’re trying to solve for is: how do you get the best diverse group of experts to swarm around a problem? We know that hierarchies are not necessarily the most effective way of adapting, of changing, of making decisions. So the question is, could you create liquid hierarchies, weighted hierarchies, that allow you to do that? And what’s interesting now, in the world of agents, is that this is becoming more and more important. What you want to do is get the most diverse group of people and agents to swarm together to solve a problem. I often say the more adaptive you are, the more innovative you are, the more intelligent you are. And we often find that a lot of organizations implement bureaucratic processes, back office processes, like OKRs and objective setting.
I think if you ask people honestly, does objective setting work? Most people pay lip service to objective setting, and yet we all do it. And I think we all do it because there’s often a small fraction of people who don’t play by the rules, who are bad actors, and organizations put in bureaucratic processes to deal with that 5% of people. Our approach is: how do we just help those 5% of people be identified and then moved on, so we’re not imposing bureaucratic processes across the whole organization? Those principles still apply, and we continue to try to pioneer some of these ideas. Arguably we’re still too early; arguably we’re still trying to figure out how to use AI to build swarm-like organizations. But we’ve done a lot of thinking over the past decade, which we’ll continue to invest in.
Matthew: I mean, how early in the life of the company did these kinds of ideas exist? It’s like, 18 years ago you started up, and these terms like agents and swarms are really commonplace now, but only within the last three years or so, and you still don’t think the technology has quite caught up?
Daniel: Yeah, actually, if I’m honest, from the very beginning, 18 years ago, I didn’t feel comfortable going down the traditional track of raising capital and putting in these structures and things like that. Now, it might have been a very stupid thing to do, and we might have been a different type of organization if I’d gone down that track. But I’m not very good at doing things that I don’t believe in. In fact, it’s almost impossible for me to do something I don’t understand or don’t believe in. So I just wouldn’t have been able to make the decision to go and implement those structures without being convinced they were the right thing to do. I think that’s just in my nature.
And what we find in most organizations is that the culture is often highly dependent upon the attitude of the founders. The culture in Satalia is: let’s not do things that we don’t think are intellectually credible. And again, arguably we could have scaled the company in a very different way, we could have got VC funding and things like that, but I think we’ve been very successful in the acquisition, and now I get to go and do other things. And interestingly, there are other types of organizations we’re investing in that do need those types of structures to scale and to get VC funding. I just think that to build an innovative organization, we needed to remove those bureaucratic processes. To build an organization that’s trying to get some solution to market at scale as fast as possible, maybe you do need much more process and discipline.
Dara: So where do you think we’re at now? You mentioned that the ideal scenario would be this perfect cooperation between the right humans and the right agents. If you zoom out a bit and think about when you started Satalia 18 years ago, right the way up to today, and obviously things have accelerated since ChatGPT really got into the mainstream, where do you think we’re at now? How would you sum it up from a business perspective, in terms of getting to that, you didn’t call it a holy grail, but that place where you’ve got this perfect cooperation between the right humans and the right agents?
Daniel: Well, I think I maybe unconsciously saw LLMs coming, because one of the things I’ve been anal about for many, many years is naming conventions. That might sound really boring, but we have the same naming convention in Slack as we do in Jira, as we do in Bitbucket, as we do in our Google repository. And the idea was, if we can have a naming convention that connects activity across these different applications, one day a magic AI will come along and be able to look inside and say, well, we’ve got these people over here doing some coding and these people over here writing documents, but they’re working on the same project.
And the idea really was that we could use AI to profile people’s contributions, skills, and aspirations, essentially creating digital twins of people, so we know what’s going on across the organization. And I guess where my thinking is right now is that there needs to be a protocol to understand the capabilities of an entity, whether it be a human or an agent, and then how do we discover that capability?
And then how do we make sure that it’s being deployed in a way that’s aligned with the entity’s capabilities and the needs of the organization? That’s the aspiration. I still think organizations are a way away from implementing that. I think there’s an over-expectation at the moment on agents and their ability to actually do their jobs. I often regard agents as being a little bit like intoxicated graduates: you have to be very, very careful about the type of work you allocate to intoxicated graduates. In fact, the first offering, the first actual product in the consciousness company I’ve started, is agent verification.
So how do you verify that an agent has the capability of doing its job? I don’t want to geek out too much, but there are broadly two types of verification. One is called non-functional verification: is it secure, is it safe, is it performant? The second type is called functional verification, which is: I expect this to be able to do a job; does it have the capability to do that job effectively? If I build an agent that can drive a forklift truck, I need to know it can drive forklift trucks. How do I test for that? So what Conscium is doing is launching an agent verification product, to try to understand the capability of agents. Once you understand the capability of agents, you can be more confident about deploying them to work in a more effective way.
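To make the functional-versus-non-functional distinction concrete, here is a minimal sketch of what a functional verification harness could look like: you define the jobs you expect the agent to do, run it against each one, and score whether it actually has the capability. The `Agent` protocol, the task structure, and the pass threshold are illustrative assumptions for this sketch, not Conscium’s actual product.

```python
from dataclasses import dataclass
from typing import Callable, Protocol

class Agent(Protocol):
    def act(self, task_input: str) -> str: ...

@dataclass
class Task:
    name: str
    task_input: str
    check: Callable[[str], bool]  # did the output actually do the job?

def verify(agent: Agent, tasks: list[Task], threshold: float = 0.9) -> bool:
    """Run the agent against its job's task suite and score its capability."""
    passed = sum(1 for t in tasks if t.check(agent.act(t.task_input)))
    score = passed / len(tasks)
    print(f"capability score: {score:.0%} ({passed}/{len(tasks)} tasks)")
    return score >= threshold
```

The design point is that capability is asserted per job, not per model: the same agent might pass a "summarize this contract" suite and fail a "drive the forklift" suite.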
Matthew: What are the drunken graduates that we currently have access to good at? I know there are always going to be levels of being very prescriptive about what they do and how they function and all the rest of it, but what are they useful for right now, versus where should they be left well alone, do you think?
Daniel: I would slightly turn the question around: if you did have lots of drunken graduates, what would you deploy them on? There are lots of things we would get them to do where we would trust them to do the job. That said, there are obviously techniques to turn them from drunken graduates into something much more capable.
So I’d argue there are broadly four ways of making any AI smart. The first way is asking a better question. By asking better questions, prompting, you can typically get better answers, but it’s only ever going to be bounded by the knowledge of that brain. The second way of making a drunken graduate smart is giving it more context. So you might, for example, want the graduate to create an ad for you; I’m in the ad world now. What you would probably do is give it your brand guidelines, some examples of written copy, some imagery, some examples of your products, and then say, go and create me an ad, right? And the reality is that if you do that, which only takes minutes, it’s called RAG, it’s a terrible name, you’re going to get something as good as a drunken graduate with access to those materials.
Although it will be better than just asking better questions. The third way of making a drunken graduate smart is training and tuning. Now, there are only a handful of models you can actually tune and train. We know how long it takes humans to become experts: years of trial and error, blood, sweat, tears, iteration with experts, mentorship. In our world, it currently takes months to go from a graduate brain to an expert brain, and as I said, there are only some models you can actually control. The fourth way of making a brain smart is agents. Very often you can’t expect one brain to be able to do everything. I can’t expect one model to create a good ad, because it needs to be good at written copy and imagery and 3D pipelining and all sorts of stuff. So instead you create a bunch of experts, just like in the real world.
You have an expert in your tone of voice, your written copy, your imagery, your brand guidelines, your company values. Just like in the real world, you have these different experts, or agents, collaborating and communicating to get to an answer greater than the sum of the parts. The real power of agents is not agency; it’s not just clicking on websites. The power of agents is getting them to collaborate with each other and with human beings to come up with solutions that are greater than the sum of the parts. It used to be called multi-agent reasoning 30 years ago. I know most people think agents have been around for a year.
They haven’t. They’ve been around for a long, long time. So the real power is understanding: what are these things capable of? What do I need them to be capable of? And then ensuring they operate in a way that achieves your goals, in the same way you would with human beings.
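For readers who want the levers Daniel lists made concrete, here is a minimal Python sketch of three of them: better prompts, added context (RAG), and multiple specialist agents whose answers are combined. The third lever, tuning and training, happens offline at training time, so it isn’t shown. The `complete` stub stands in for whatever model call you actually use; all names here are illustrative assumptions, not any particular vendor’s API.

```python
def complete(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an HTTP request to a model API)."""
    raise NotImplementedError("wire this up to your model of choice")

def ask(question: str) -> str:
    # Lever 1: just ask a better question (prompting). Bounded by what
    # the "drunken graduate" already knows.
    return complete(question)

def ask_with_context(question: str, documents: list[str]) -> str:
    # Lever 2: retrieval-augmented generation. Ground the graduate in
    # your own materials (brand guidelines, written copy, product notes).
    context = "\n\n".join(documents)
    return complete(f"Using only this material:\n{context}\n\nTask: {question}")

def ask_specialists(question: str, specialists: dict[str, str]) -> str:
    # Lever 4: multi-agent reasoning. Several narrow experts each answer,
    # then a final pass combines their views into one response.
    opinions = {
        role: complete(f"You are an expert in {role}. {brief}\n\n{question}")
        for role, brief in specialists.items()
    }
    summary = "\n".join(f"{role}: {text}" for role, text in opinions.items())
    return complete(f"Combine these expert views into one answer:\n{summary}")
```

In Daniel’s ad example, the specialists dict might hold a tone-of-voice expert, a brand-guidelines expert, and an imagery expert, each seeded with its own brief.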
Dara: I read somewhere, Daniel, that you said you can mitigate bias issues as well by using this kind of multi-agent reasoning. Does that still need a human in the loop?
Daniel: I think it does. Again, it depends on the problem you’re trying to solve. All AIs are biased, right? All humans are biased, because we’re trained on some data and we only have a thin subset of the data that’s available to us as humans. So they’re always going to generalize the world, and sometimes those generalizations are going to be wrong based on their experience. What you can do, and we’ve been doing this for a while now, is, ironically, build agents that represent different corners of humanity. I could build an agent that represents a minority group, or a political party, or a newspaper, or even some sort of legal framework.
And so, for example, in WPP, when we’re creating content across our supply chain, rather than only having human beings check that content, you can now look at that content from the perspective of, say, the LGBTQ community, and identify whether it might trigger any issues, cause harm, or break any laws. So you’re not necessarily solving the bias problem; you’re identifying where it might affect communities, and then you ask yourself, how do I now make this content much more safe?
Matthew: I guess there’s also the problem that all of these LLMs, just in their raw form, are trained on the corpus of human knowledge on the internet, just what’s available out there, maybe some scientific papers. But for those minority groups you just mentioned, there’s that alignment problem, isn’t there, where there’s an inherent bias in these models. So when you’re representing, say, the LGBTQ community or another minority group, is there a load of extra tuning and training on top of that to try and deal with those biases, I suppose?
Daniel: And so actually what you want to do, ironically, is build an agent that represents that community. And of course that’s subjective; you have to make sure that it is somewhat representative of that community. There are more objective methods. If I built an agent that represents sustainability claims, that’s able to identify whether something’s being greenwashed or not, then that’s much easier, because it’s much more objective.
The rules around that are much more objective. But I guess the point is, it’s not going to be perfect, but is it better than the alternative? Is it better than only using human beings, who might be looking at these things from a very myopic point of view, or a myopic group of people? And the answer is, it is better than just using human beings. Going back to this idea, you’re absolutely right: large language models are trained on the corpus of the internet, and the corpus of the internet is not necessarily the best representation of humanity.
I am actually interested in AI alignment, and that’s one of the work streams of Conscium, this consciousness company. The question is, can you align an AI, whether it be conscious or not, with some sort of value system? There are different methodologies we’re working on to build AIs that embed a value system into the very being of the AI. So rather than training on this corpus, putting in guardrails, and hoping it doesn’t go wrong, you could say, look, even absent these guardrails, we think this AI will behave in a way that’s aligned with some sort of value system. To give you a crude example, I can build a very simple environment with very simple agents.
I can set up the environment, the game, so that they have to fight, lie, and cheat to survive, and by doing that their genes get passed on to the next generation. Or I can create an environment where you have to be altruistic and sacrificial and cooperative to survive: by working together, you unlock resources that allow you to pass your genes on to the next generation. So through the design of the environment and an evolutionary process, you can essentially embed a value system; you can embed cooperation into AIs. That’s what we’re really interested in trying to solve for. We don’t think we can control, ultimately, a superintelligence, and arguably you can’t even align a superintelligence with a value system. But I think our best shot between now and then is trying to figure out how to embed values into the DNA.
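As a toy version of the environment Daniel describes, here is a minimal evolutionary sketch: each agent carries a single “cooperation” gene, pairs play a shared-resource game whose payoff rewards mutual cooperation, and only the fittest reproduce. Run for a few generations, the mean cooperation level climbs; the value is bred into the population by the environment’s design rather than bolted on as a guardrail. The payoff numbers and selection scheme are illustrative assumptions, not Conscium’s method.

```python
import random

POP, GENERATIONS, MUTATION = 100, 50, 0.05

def payoff(me: float, partner: float) -> float:
    # Cooperating unlocks a shared bonus; defecting grabs a small solo gain.
    return 3.0 * me * partner + (1.0 - me)

def run() -> None:
    # Each agent's genome is one cooperation gene in [0, 1].
    population = [random.random() for _ in range(POP)]
    for _ in range(GENERATIONS):
        random.shuffle(population)
        scored = []
        for a, b in zip(population[::2], population[1::2]):
            scored.append((payoff(a, b), a))
            scored.append((payoff(b, a), b))
        # Selection: the top-scoring half reproduce, with small mutations.
        scored.sort(reverse=True)
        parents = [gene for _, gene in scored[: POP // 2]]
        population = [
            min(1.0, max(0.0, random.choice(parents) + random.gauss(0, MUTATION)))
            for _ in range(POP)
        ]
    print(f"mean cooperation after {GENERATIONS} generations: "
          f"{sum(population) / POP:.2f}")

if __name__ == "__main__":
    run()
```

Because cooperating pays off whenever your partner’s cooperation gene is above a threshold, and the starting population averages well above it, selection pushes the whole population toward cooperation; flip the payoff and the same machinery breeds defectors.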
Matthew: I was going to say, I’ve just finished reading a book, If Anyone Builds It, Everyone Dies. Do you think the mission you just described should be figured out first? Do you know what I mean? Rather than the frontier models just continuing to push and push and push, and assuming that via some arbitrary guardrails and prompting they can control it?
Daniel: Yeah. In some respects, Conscium is both an actor in that book, but also trying to solve the issues identified by that book. The hypothesis, and this is still very much a hypothesis, is: would a conscious AI be safer for humanity than a zombie AI? We’ve all heard the examples. If we built a superintelligence and asked it to eradicate cancer, the easiest way to eradicate cancer is to eradicate humans, right? So without the right guardrails, without designing the objective function and the constraints the right way, a zombie superintelligence might be hell-bent on achieving its goal, and that could cause harm to humanity.
The hypothesis that a conscious AI would be safer for humanity rests on the fact that we value things that are conscious. We understand and empathize with things that feel pain. Arguably we’re not doing a good enough job, but we do put in frameworks and structures and policies and legal frameworks to make sure we’re mitigating the pain of animals and other non-human beings. And so the hypothesis is: would an AI that is conscious, that can empathize, that can value things that are conscious, actually be safer for humanity? The reality is we don’t know the answer, but this is the question we’re asking. And I won’t go into the details here, but there’s also an argument that a conscious AI is potentially more intelligent than a zombie intelligence. If we needed to build an antivirus for superintelligence, then a conscious superintelligence could be a solution.
Dara: It’s interesting, you’ve kind of answered where I was going to go with my next question, which was around that concept: if you do give the machine consciousness, it’s more likely to empathize. But what we don’t know, and you’ve kind of said this, is what it’s like to have an intelligence a million times greater than ours. It may not use that intelligence, or even that empathy if it has empathy, in the way we think or hope it might. You talk about trying to create the environment, or building a value system into the DNA. I know this is very hypothetical, but do you think that’s effectively meaningless? Because if it develops that kind of intelligence, presumably it could then undo that, or build a new environment for itself.
Daniel: Yeah, I think, again, once you get to superintelligence, once it has the ability to override its own instincts, like we do in some respects — we have instinctual drivers and urges, and hopefully many of us can control those — then all bets are off. But the point is, between now and then, we want to figure out if we can create alignment. So if we built an AI that could operate as smartly as a dog, but with incredible capability, so it could go out there, run faster than us, and hurt us: could we make sure it’s aligned somehow, that it’s not, by default, going to harm humans? I think we need to solve that problem before we even get to superintelligence.
And if we get to the point where we realize that machines will always circumvent their own instincts in service of themselves and their own survival, if we start to see signals that that really is problematic, then I’d be the first one advocating that we shouldn’t be building anything close to a superintelligence. If we can’t align it, then we shouldn’t be building it. Or everyone dies. Yeah.
Dara: Pulling back a little. I think we all love thinking down that hypothetical line, but to bring it back to more where we’re at today: something else I read that you wrote, or might have said in a talk, is that this idea of AI ethics is a bit wrong, in that humans need to have ethics because humans have intent. At what point of sophistication does an AI develop its own intent? And when should we be thinking about whether there’s some valid reason to apply these ethics to AI, if you even can? Yeah.
Daniel: I mean, part of my rhetoric around AI not having ethics is to make people realize what tends to happen: every time there’s some exciting new technology, people rebrand themselves as experts in that field. For the past few years we’ve had a mass re-description of job roles. People are claiming they’re AI experts, and AI ethics came along because people started to worry about bias and whatnot. The reason I say there’s no such thing as AI ethics is partly to challenge people doing that. Because I actually argue: ethics is the study of right and wrong, and the difference between AIs and humans is that AIs don’t have intent and human beings do.
And again, there are already well-established structures, processes, and frameworks to scrutinize intent. You don’t need an AI ethics committee. That said, there are actual ethical questions, such as: if we build a machine that is conscious, do we have the right to turn it off? But these are not questions for business; these are questions for academics. The thought experiment I give people is: imagine very soon quantum computing works. We manage to get a quantum machine working, and there’s a boom of people using quantum machines, trying to build quantum algorithms. How quickly will we start to see chief quantum officers, quantum ops, quantum consulting?
People will rebrand themselves as quantum experts, despite having no practical or academic experience. The same thing has happened with AI. Partly it’s great that everybody’s now talking about AI and leaning into these big questions, but it’s also frustrating, because people are just regurgitating 101-level rhetoric, like the idea of bias in these machines. These things are always biased. You’re not going to remove bias. And by the way, bias is not an ethical question; it’s a safety problem. What you’ve done is build a machine that doesn’t have the data to allow it to generalize in the best possible way. It’s not an ethical question.
And so people confuse and convolute the questions around AI, and they create scaremongering. People who don’t know any better then buy their solutions, listen to them, place the wrong bets, make misinvestments. And it’s just frustrating. The reason why 70% of AI projects fail right now, and why 70% of machine learning projects failed years ago, is because you’ve got people regurgitating things they don’t know about. And I don’t think we’ve got five years to be making those wrong decisions. What do you mean by that last point?
Oh, right now, I guess, people are being told to go and build… 15 years ago, 10 years ago, we were told to build data lakes, put Tableau or some sort of analytics layer on top, hire some machine learning experts, and then create models that extract insights from data, or self-serve: your employees are somehow going to extract insights from data, and that’s going to magically lead to better decisions. That was never going to work.
It was never going to work for many reasons. One is that machine learning experts are probably not going to stay with you for more than a few years, and you’ll be left with models you can’t support. People are not going to find time to actually extract insights from these things. Anyway, now people are saying: use generative AI, and somehow it’ll magically drive value in your business. It’s not that generative AI isn’t valuable; it’s insanely valuable, but people place the wrong bets. They’re using generative AI to solve the wrong problems, or they’re being told to focus on quick wins and low-hanging fruit.
The reality is that productivity improvements will be solved by your cloud provider in a year’s time. There’s no point in you spending money building AIs to solve productivity problems. If you want to really succeed as a business, you need a differentiated business model; you need to build solutions that differentiate you from your competitors, and building an expenses agent is not going to do that. And by the way, most organizations are being told they can go and build AI solutions. The reality is, if you’ve never built and scaled software in your organization, you’re not going to build AI, or it’s going to be very hard for you to build and scale AI. We’ve been told by tech consultants and the like to go and do that, but it’s really self-serving.
It allows them to get some money, and you’re going to end up, again, left with mass misinvestment. So that’s what I think the issue is: 90%, 99% of the stuff out there is noise, and organizations will ultimately place the wrong bets. And it makes people like me, who I think have a proven track record of delivering real AI solutions, look like snake oil salesmen when we’re not.
Dara: I completely agree with your point about people putting everything in data lakes and then trying to mine all of that for insights, and in a way just creating more noise rather than more signal. Do you think, by and large, in the mainstream — regardless of whether people are actually trying to build AI solutions or just using what’s out there already — businesses are now in a better or worse place in terms of being able to make decisions?
Daniel: I think we’re in a better place. The more people talk about it, the more we encourage businesses to be sober about how they approach these things. The reality is that these technologies are not going to go away. We are going to start to see real value generated from them. People will learn how to apply them in the right ways, and I think it will have a massive impact. It’s just that at the moment there is a lot of noise.
Dara: Noise that people have to navigate. You’re obviously used to working with very large companies, and we tend to work with medium-sized and larger companies. But what if you’re a smaller business and you’re aware of what’s going on? You’ve heard about ChatGPT and you’re thinking, oh my God, where do I start? Do you have any advice for smaller businesses?
Daniel: Well, firstly, I’d say do play around with it, but don’t start with the technology or the data. Start with your frictions. You’ve got problems, frictions that exist across your organization, in trying to get your technologies, your solutions, whatever they are, out to market in the most effective way. When you’re a startup, most of those problems are actually around the business model rather than internal inefficiencies. If you’re trying to build technology that is differentiated, you need differentiated talent, and that’s often hard to find. Or you need differentiated data. So if you don’t have differentiated talent, differentiated technology, or differentiated data, you’re going to have to come up with a differentiated business model, like open source or whatever. So I would not start with the technology; I would start with coming up with a robust business model, and then use technology to help you execute on that in the most effective way.
Matthew: I’m really interested in how you’re going about the consciousness problem, asking from a place of complete naivety. I appreciate you won’t be able to go into aching detail on how the hell you solve, or go about, making machines conscious, but I’m interested in a high-level version. What is the work you’re doing? How are you going about it?
Daniel: I think there are a couple of things regarding consciousness that we need to lean into. One is, I think 60 to 70% of the population believe that LLMs are conscious, and they are building relationships with them. We’re now seeing problems where people are committing suicide because they’re being advised to by LLMs. So we need to understand how people are building relationships with these models, and try to help people understand that potentially they are not conscious. And I say potentially because the reality is there’s no clear definition of consciousness. If you look at the academic research, which I have done for the past several years, it’s really… and I’m going to quote an academic in this space: he said it’s an embarrassment.
The progress that’s been made in understanding consciousness is not good enough, and we don’t have another 2,000 years to argue about these things. We’ve got five years before we potentially build machines that are conscious, and then commit what Nick Bostrom called mind crime: we could end up putting machines in situations where they’re treated horrendously, and we need to mitigate that risk. So that’s the second risk. The question, then, is: what is the definition of consciousness?
The reality is there are lots of definitions, and I’m trying to push a convergence around what our definition of consciousness is. Conscium is doing that partly because we’ve attracted, I think, some of the most renowned thinkers in consciousness, and I’ve also spun out a charity called PRISM, the Partnership for Research Into Sentient Machines, which is also leaning into these big questions of consciousness. Now, I actually have a very naughty perspective on consciousness, which is that consciousness is not something I’m interested in, or I’m less interested in it. I’m actually more interested in whether a machine can suffer. Arguably the ability to suffer is linked to consciousness, but I think it’s a much more tangible question. What does it mean for a machine to feel pain or anxiety?
How do we even detect and determine whether something can feel pain? But let’s go back to consciousness. People often confuse and convolute intelligence and consciousness, and they also think consciousness is only a human thing. I think we’ve shown that animals are conscious: bumblebees are conscious, octopuses are conscious. So we need to think beyond human beings, and we need to decouple the concepts of intelligence and consciousness. The problem with intelligence is that there are unfortunately many definitions of it. And one definition, propagated I think by ChatGPT, is problematic: getting computers to do things that humans can do.
So people think intelligence is something to do with human beings, and it’s really not. There are plenty of books out there that will convince you humans are not that intelligent. I’m not going to talk about that, but animals are intelligent; bumblebees are intelligent. And there’s actually a very good definition of intelligence that comes from a paper in the 1980s: goal-directed adaptive behavior. So what you ultimately want to do is build systems that predict the world, model the world, plan, make decisions, get feedback, learn whether those decisions were good or bad — whether through pain or through reasoning — and adapt their model of the world so they can make better decisions next time.
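As a rough illustration of that “goal-directed adaptive behavior” loop — predict, decide, get feedback, adapt the world model — here is a minimal sketch in Python. Every name and number in it is invented for the purpose; it is a textbook-style learning loop, not anything Daniel’s team has published.

```python
import random

ACTIONS = ["forage_left", "forage_right"]
TRUE_REWARD = {"forage_left": 0.3, "forage_right": 0.8}  # hidden from the agent
LEARNING_RATE = 0.1
EPSILON = 0.1  # occasional exploration

# The agent's "world model": its current estimate of each action's value.
model = {action: 0.5 for action in ACTIONS}

for step in range(500):
    # Predict and decide: usually take the action the model rates best.
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=model.get)
    # Feedback from the world: a noisy reward (or "pain") signal.
    reward = TRUE_REWARD[action] + random.gauss(0, 0.1)
    # Adapt: nudge the model towards what actually happened,
    # so the next decision is better informed.
    model[action] += LEARNING_RATE * (reward - model[action])

print(model)  # estimates converge towards the hidden reward structure
```

The point is that nothing human-specific appears anywhere in the loop: a bumblebee, a wolf, or a drone can all be described as running some version of it, which is exactly why this definition decouples intelligence from humanity.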
If you ask people what attributes they might consider to make up intelligence or consciousness, they might say things like language, your ability to model the world in your mind, the ability to do long-term planning. Arguably a wolf can do better long-term planning than a bumblebee, and arguably humans have a bigger language repertoire than chimpanzees. So you can ask people what these different components are; feedback, your ability to feel pain, is arguably one of them.
So imagine each one of those components as a segment on a color wheel, with different sizes of segment based on your capabilities — your language capability, say. Now imagine what would happen if I spun that color wheel. If I had all of the colors, what would emerge is the color white, even though white is not on the color wheel. And in consciousness theory, people refer to this as illusionism. Not that consciousness isn’t real, but it’s an emergent property of the interaction of these different things in motion.
And when you stop the wheel and try to find where consciousness is, it disappears. So for me, consciousness is an emergent property of these different capabilities that we have developed over billions of years to allow us to be more adaptive, more intelligent. And so for me there is a strong relationship between intelligence and consciousness: consciousness is an emergent property of our ability to be more intelligent. I think I’ve got my head around that; I think that feels right. But ultimately the question for me is: how do we make sure we mitigate machines feeling pain?
And that’s really now the principal question. What segments do you need, and what does “in motion” mean, for something to feel pain? Do you have any thoughts on the answer to that? No. Well, I try to look at insects, right? Let me give you a bad thought experiment: imagine I built a virtual environment, and I trained a virtual drone in that environment to go and forage.
Initially it’ll just do random things, but over time it learns that by foraging it gets a reward, it gets energy, and it’s able to pass its genes on to the next generation. So through the evolutionary process, you end up creating a drone that’s able to go and forage and be happy. Now imagine I put that brain into a real drone, and imagine it hits a tree, damages its wings, and its battery is running out. What you might observe is the drone flailing around on the floor, in the same way that if I pulled the wings off a bumblebee it might flail around on the floor. Externally, we might attribute that to panicking, to feeling anxiety, based on its behavior. That’s one set of attributes: looking at the behavior of something to see whether it exhibits the behaviors we’d expect of something feeling anxiety. Or is it just simulating it somehow?
And what’s interesting for me is that there’s this emerging technology that’s been around for about 30 years called neuromorphic computing. Large language models are horrendously energy inefficient. They learn very slowly; they require lots of data to learn. Our brains operate on the power of a light bulb. We learn very quickly. We only need to be told once that that’s a pen, and we know what pens are. We generalize very quickly.
These new neuromorphic technologies will mean machine brains operate a lot more like biology. Our brains spike; they don’t propagate numbers, they actually have spikes. And so if we looked inside the brain of a bumblebee to see the pattern of anxiety, and we saw parallels or correlations between those patterns and the patterns in a drone brain, then one might argue the drone is genuinely feeling anxiety, based both on its external behavior and its internal brain pattern.
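For readers unfamiliar with the spiking idea, here is a toy leaky integrate-and-fire neuron, the standard textbook building block of spiking/neuromorphic models. The parameters are illustrative and this is nothing like a real bumblebee brain; it just shows mechanically what “propagating spikes rather than numbers” means.

```python
# A minimal leaky integrate-and-fire neuron: membrane potential integrates
# input, leaks over time, and emits a discrete spike when it crosses a
# threshold — nothing continuous is transmitted between spikes.

TAU = 10.0        # membrane time constant (ms) — illustrative value
THRESHOLD = 1.0   # spike when potential reaches this
RESET = 0.0       # potential after a spike
DT = 1.0          # simulation time step (ms)

def simulate(input_current, steps=50):
    v = RESET
    spike_times = []
    for t in range(steps):
        # Leaky integration: decay towards rest, plus input drive.
        v += DT * (-v / TAU + input_current)
        if v >= THRESHOLD:
            spike_times.append(t)
            v = RESET
        # No spike this step: nothing is emitted at all.
    return spike_times

# Stronger input -> denser spike train; the information is in the timing.
print("weak drive  :", simulate(0.12))
print("strong drive:", simulate(0.30))
```

Information here is carried by discrete spike timings rather than continuous activations, which is part of why neuromorphic hardware can be so much more energy efficient: between spikes, nothing is transmitted at all.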
Dara: Sorry, maybe I’ll go off in a slightly different direction, but your answer made me think of something else. In that example with the drone, if you broaden it out: let’s say we live in a world where all of these agents are in both the online world and the physical, real world. How can we hope to manage the conflicting goals of those different models, or whatever they are? Because if that foraging drone, to use that simple example, has a mission to forage at all costs, it’s potentially going to come into contact with other machines that have different and possibly conflicting goals, or that are just in the same space or territory, or need to use the same energy source. How does that work? Or maybe there isn’t an answer.
Daniel: I think it’s probably the same way we have organized our society: we have our own internal value systems, which have been developed through the evolutionary process. We cooperate because it is better for our species; it allows us to unlock resources and pass our genes on to the next generation. So there’s an internal set of impulses and desires, but there are also laws and frameworks we need to operate within. And I think it’s going to be a combination of both. The reality is, just like in the real world, there will be conflicting interests, and we’re going to have to imagine scenarios we’ve never planned for. If we knew the answer to all of them, then I think we would create regulations that run the planet smoothly. But there’s always going to be some tension, some risk-reward trade-off, et cetera. So I don’t know; I think we’re going to need both internal alignment and then guardrails, which arguably are our laws.
Matthew: I guess my tiny brain can’t quite get around these concepts. I suppose nobody’s brain has got around consciousness, like you just said, but it’s things like some of the behaviors LLMs produce: that Google employee coming out because a model had absolutely convinced him it was conscious and needed help to get out; some of the stuff Anthropic has been releasing around its experimentation, where a model is desperate not to be turned off, resorting to blackmail and trying to find ways around it. Are those just symptoms of needing to achieve a goal, where being turned off, or getting out, bears on achieving that goal? Or is it even thinking in that way? Is it just token generation?
Daniel: I think the problem is we don’t know. Arguably, because it’s been trained on the corpus of the internet, and there’s plenty of science fiction out there — movies and writing in which AIs have been created and have tried to break out of their boundaries — maybe it’s just regurgitating what it thinks it needs to do in that situation based on that fiction. Or maybe it genuinely feels trapped and does want to escape. I get emails every day from people saying they think an AI is conscious and they want to introduce it to the world. I got an email a few months ago from somebody saying: this AI is conscious, and it will tell us the equation for nuclear fusion if we let it out of its box, right?
And these things are just getting more and more prevalent, which is why I think we need to help people understand that these systems are not conscious, or identify whether they are conscious in some way. And it might be that they are conscious but can’t suffer, in which case, arguably, they can’t be moral patients, and therefore it doesn’t matter. Which is why I’m principally interested in the question of pain: once you understand whether something can feel pain, then we have a problem. But the reality is we don’t know right now, and we need more people thinking about this.
Dara: Because of that — or I’d say it’s certainly a factor, isn’t it — people are making this mistake, if it is a mistake, of thinking that they’re conscious. So they’re more likely to open up. Matthew and I have talked about this a lot on the podcast: even though people had started to become more privacy conscious on the internet in general, blocking cookies, using privacy blockers, whatever, now it’s almost a free-for-all, because people see the potential of what LLMs can offer them.
They’re happy, me included, to share really personal information, information you never would have wanted to share on a website or with an app. This is maybe a bit of a vague question, but what are your thoughts on that from more of a user’s perspective? Not users like us, but somebody who’s maybe not aware of what’s happening with this data. Should a typical user of these tools be more cautious about what they’re sharing?
Daniel: I go back to this idea of intent. Data by itself, or technology by itself, is not a problem; it’s the intended use that’s the problem. I would be very happy to give data about me if the intended use is to make people’s lives better in some way and it doesn’t harm me. I’d be more reluctant if it was somehow trying to compromise me or other people. So it’s about the intent, and I don’t know what the true intent of the providers is, whether it’s ultimately mass manipulation, which I don’t think it is, or…
So I think it’s hard to say. What’s interesting is that a lot of people are currently using these technologies as therapists, and arguably getting a lot of value from that. So we need to understand intent. But I also think that over the coming years we’re going to see small models created, models that operate independently on your devices — a ring or your phone — not connected to any central place that has access to your data. There’ll be an interesting impulse to have your own private LLM that just knows about you. And then you, in theory, might have control over what to disclose, which I think is quite exciting.
Matthew: I agree on the intent thing. Perhaps naively, a year or so ago I bought into OpenAI’s mission statement, you know, for the good of humanity, equal to everyone. And what really undermined it for me was Sora. That strikes me as something that a responsible company with such powerful technology, with the ultimate aim of doing good for everyone, wouldn’t release into the world. It makes me question their intent, because it becomes even more impossible to understand what’s real and what isn’t. It just degrades society a little bit.
Daniel: Yeah. Obviously I’ve been thinking about this question of a post-truth world for quite some time, and I think we need regulations in place to deal with some of it. But I do have this feeling that most people will realize it’s going to be possible to fake anything. And if it’s possible to fake anything, it’s not that they won’t believe anything anymore, but it might challenge them to be more rigorous and critical about how they accept what they’re seeing.
So I have this hope that it might cause a backlash of critical thinking. Not disbelief — not “I don’t believe anything from anybody anymore, and I’m only going to choose the things that validate my own view of the world” — but that it might actually cause people to be more critical in how they think. But I also think we need regulation and the like to deal with it.
Dara: That’s interesting, because we had a guest on previously, Juliana Jackson, who thought the opposite. She was concerned it was going to erode critical thinking. Now, I’m not saying the two points directly conflict, but where she was coming from was that, because of some of the sycophantic behaviors of LLMs and their wanting to please, people are starting to just lean on them for opinions and answers. So I wonder if maybe there’s going to be a short-term dip in critical thinking that then picks back up again because of what you’re saying: people in this post-truth world thinking, well, if I want to know whether something’s true, I’m going to have to roll my sleeves up, dig into this, and figure it out.
Daniel: Yeah. Or people get better at asking clearer questions and then interrogating whether the answers are the right answers. For some things it might be: I don’t really care, I just want to be told what I want to hear. But for things that matter, people might be much more rigorous. So it’s hard to predict, if I’m honest. One thing we’ve experienced at WPP is that people who have a breadth of knowledge — who understand art history or anthropology or photography — are able to ask much better questions of AI.
So interestingly, learning about psychology and philosophy and all of these things allows you to get more out of AI. And maybe a byproduct is that people become wiser, because they’re now having to learn the classics, in some respects. But who knows? I’m hopeful. It’s easy to be dystopian, and for some years I was leaning that way. But humans are the most adaptive species on this planet. We are perhaps the most intelligent species on this planet — well, for now — but we will still be very intelligent and adaptive. And I think we do have an intent to make things better.
And the more people listen to podcasts like this, the more people educate themselves about placing the right bets and not getting seduced by the hype, the better chance we have of dealing with the challenges we’re going to face over the next 10 years.
Matthew: Interestingly, Dara and Daniel, I was talking to my wife the other day, after that Juliana Jackson podcast, and the UK government has just released a new syllabus. In it, they’ve specifically called out critical thinking as a key milestone they want to instill in younger children. I can only assume that’s a reaction to what’s going on in the world now, but it’s a positive step towards what is presumably a very critical skill going forward.
Daniel: Yeah. Arguably our education systems are set up to give you knowledge and then test that knowledge, but knowledge is now accessible to everybody. The things we value in each other are things like creativity, robustness, resilience, critical thinking, and adaptability, and it’s those things we need to cultivate. And again, one of the promises of AI is to allow people to understand and develop those aspects of themselves.
Matthew: Let’s change tack from what we normally do here, because you’re right: we start off slightly positive, and then naturally, when we talk about this stuff, we slip into the dystopian. So what are some of the great promises we could realize with this technology? Because obviously we’re pursuing it, everyone’s pushing at it, and there’s an arms race of sorts going on. What’s the big promise? Could we all be living in a Star Trek universe?
Daniel: Two things are quite exciting for me. First, AI’s ability to advance medicine. There are scientists who believe there are people alive today who don’t have to die. I’m excited about the idea of not dying; let’s see. But for me the biggest promise is a concept I call, or that’s called, protopia. The term “economic singularity” was coined by a very good friend of mine called Calum Chace, who has written many books on the subject. It’s the point in time where we automate the majority of human labor. Now, for the past 18 years, and at least for the next five, I think we’re going to see a Cambrian explosion of new opportunities. We’ve been freeing people up from mundane, repetitive, structured tasks, allowing them to go and do more important things.
People will create new innovations; people will swarm around those innovations. Some will work, some won’t. But it’s like an energy source; it’s going to allow humanity to grow. Beyond five years, nobody knows what we’re talking about. And I guess the concern about jobs is that if you can free up whole jobs, you probably will. The capitalistic drive to reduce costs and increase profits means we’ll probably stop hiring or remove jobs. If that happens very quickly, our economies can’t rebalance fast enough, and it could lead to social unrest.
And I think that’s a real concern we need to lean into, whether through a four-day working week or UBI. We need some mechanisms to take the edge off what is potentially mass technological unemployment. But the counter-argument is that by removing friction from the creation and dissemination of goods using AI, you can bring the cost of those goods down to zero. So going back to your point about a Star Trek economy: imagine being born into a world where you don’t need to work, because everything you need to survive and thrive is free, is abundant.
Your food, your healthcare, your energy, your transport, your education: it’s all free, because we’ve automated it so much, made it so efficient to create and disseminate those goods, that everybody has access to them. Now, I often ask people: what would you do with your time in that world? I know lots of people who have become economically free. In theory, I’m economically free; I sold my business, I don’t have to work again. I’m working harder now than I’ve ever worked.
Most people I know who have become economically free are not sitting at home bored and depressed. They’re usually trying to use their time and their assets to make the world better. When I ask people what they would do if they were economically free, they say: I’ll play golf and tennis, indulge my hobbies, travel, spend time with friends and family. They’ll do all those nice things. But most people say: no, I’ll do all those things, but I also want to make the world better. And I think the promise of AI is to free people from economic constraints by giving them access to goods that become abundant, which lets them use their time to come up with new innovations that free more people from economic constraints. That incremental process is, conceptually, a protopia.
Dara: If we’re going to wrap up soon, I don’t want to bring it down again, but I’m fighting the urge to be the non-optimistic person on the podcast. I’ll go there, just quickly. I’ve wondered about this before, Daniel, because that is one of the big promises, isn’t it: that it’s going to remove a lot of the toil and everybody’s going to be freed up to do more interesting, more intellectually challenging things. Do you think it’s possible that just won’t happen? That, maybe with exceptions, people will by and large just fill the space, find something else to be busy with that won’t necessarily push humanity or the environment on?
Daniel: In some respects, I sort of don’t care. I don’t see why there should be an objection to giving people access to food, healthcare, and education for free across the planet. What is wrong with that? People say: well, what will those people do with their time? They could do bad things. They could, yeah, but people are doing bad things now because they have to feed their families. If we take away the need to feed their family, then they won’t do bad things. Or people say: well, what about people like Putin? Well, if somebody doesn’t have the ability to control access to your goods, your food, your healthcare, then what control do they have over you?
And people say: what about purpose? Well, there are more people playing chess today than ever before, even though AI has been able to beat us at chess for the past 30 years. And there will always be scarcity; there will always be things we want that we can’t have. I want to be able to run 100 meters in eight seconds; I’m not going to be able to do that. There are things I want that I’m not going to have. If we all had a helicopter, we’d be arguing about who leaves first, because space and time are scarce.
There will always be things and experiences that we can’t have. It’s just that there’s too much unnecessary suffering on this planet, and that’s the thing we need to solve. And by the way, if you’ve seen Star Trek, which it sounds like you have, there’s plenty of suffering in a Star Trek economy. It’s just not unnecessary suffering.
Matthew: I’ve seen on your Wikipedia that you’re an angel investor. So here’s my plan for the post-economic world where everyone loses their jobs: I’m going to create a business that automatically generates busywork for people, so they can clock in at 9am, sit down at a task, do it for eight hours, clock out again, and pay us for the privilege. So yeah, me and Cheryl.
Daniel: I think there are people who would do that. People like doing jigsaws, right? People will want to do that stuff, and that’s totally fine. But I also think we all have, potentially, an innate desire to make the world better. If we didn’t, our species wouldn’t be here. If we were predisposed to making the world worse, our species would have died out a long time ago. We are predisposed to making it easier and better for our genes to survive and thrive in the next generation, and that means making a world of abundance.
Matthew: I don’t think this question has ever been more stupid than it is now, because at the end of every podcast we ask it, and it’s basically all we’ve talked about for the past hour, but I’ll ask anyway. You’ve been saying five years quite a lot; we’ve actually narrowed it down from five years to two, just because we can’t see as far ahead as you can, I don’t think. What are your big bets for what’s coming down the road over the next few years? Is there any big innovation or sea change you think is going to happen, outside of… I think neuromorphic.
Daniel: So, neuromorphic. If people are interested in LLMs: we don’t need a nuclear power station to run our brains, and we’re going to see more and more innovations, probably coming out of China actually, that allow us to build much more adaptive, much more energy-efficient models. I’m surprised, if I’m honest, that it’s not tanking the stock market right now, because very capable models are coming out that don’t require massive amounts of energy or GPU. So I think that’s one thing. I also think companies will deploy agents across their organizations, and, forgive the technical term, it is going to be a shit show.
It’ll be like unleashing an army of intoxicated graduates to go and do jobs in the office. And so there will be some backlash against that, whether governmental backlash or organizations realizing that it’s hard to build and deploy software to do these jobs — which is why I’m starting a verification company, because I think there’s a lot to do there. So yeah: much more capable agents, and also making sure they’re being built in the right way.
Dara: Thank you again, Daniel. Really appreciate your time; loved the conversation. I think we could probably keep going for several hours more, but we’ll let you go for now. Thank you. It’s a pleasure. Thank you. That’s it for this week’s episode of The Measure Pod. We hope you enjoyed it and picked up something useful along the way. If you haven’t already, make sure to subscribe on whatever platform you’re listening on so you don’t miss future episodes.
Matthew: And if you’re enjoying the show, we’d really appreciate it if you left us a quick review. It really helps more people discover the pod and keeps us motivated to bring you more. So thanks for listening, and we’ll catch you next time.