In 2017, Aidan Gomez, then a third-year student at the University of Toronto, began an internship at Google Brain — the tech group’s artificial intelligence research team focused on deep learning. There, he worked with Lukasz Kaiser on creating a single neural network that could learn to do “everything”, rather than training different models for different tasks.

He also worked alongside Noam Shazeer — now the chief executive of Character AI — as he was developing language and attention-based models. Attention is the ability to learn relationships between two elements in a sequence, and one of the key advances that enabled AI to generate text that makes sense.

In 2019, Gomez, with colleagues Ivan Zhang and Nick Frosst, co-founded Cohere AI to develop large language models for business, building on the research he had worked on at Google. The company recently secured $270mn of new funding from backers including Nvidia, Oracle and Salesforce Ventures, valuing Cohere at around $2bn. Here, Gomez talks to FT venture capital correspondent George Hammond about where he sees the real potential — and the real risks — in generative AI, including OpenAI’s ChatGPT chatbot.


George Hammond: Google seemed to have been caught a little bit cold by the launch of ChatGPT last year. Why do you think that was? Are they in a position to catch up?

Aidan Gomez: I can talk about my time inside of Google. Google Brain was the hub of excellence in AI. It was the best laboratory for AI that existed on the planet and everything that people saw publicly had been brewing internally, quietly, for a very long time.

I think there were — and remain — some really extraordinary folks. During the research phase of this technology, during the tinkering and discovery phase, Google Brain was the best place you could possibly be. I think we’ve now moved into a phase where it’s about building real products and experiences with the technology that we developed. And that was where Google was not the best place to be.

For me, personally, and for my co-founders — because Nick [Frosst] was also at Brain — to deliver on what we wanted to deliver on, it had to be done independently. It had to be done through Cohere.

There are some things that big companies are just not good at and I think a start-up environment allows you the freedom, the flexibility and the speed to accomplish them.

GH: I went to see Mustafa Suleyman at Inflection AI speak a couple of days ago and he was talking about that ChatGPT moment. His line on it was: we were playing with a ChatGPT equivalent two years ago within Google and — for whatever reason — there was just this hesitation about putting it into consumers’ hands. I don’t know what the culture is within Google but it’s very striking. The technology was there but the willingness to pump it out in the way that OpenAI did was, perhaps, not.

AG: I think that’s very accurate. The technology ambition was immense and the technology capacity, in terms of talent and compute, was immense. The product ambition . . . was absent.

GH: Let’s talk about that capacity point. How do you maintain a head start? I know Google has had this leaked paper on ‘Where is our moat?’ What is your answer to that? Is it a question of more compute, better talent, greater resourcing or is there a different way of staying ahead?

AG: Is there a moat? Increasingly, I’m realising that what Cohere does is as complex a system as the most sophisticated engineering projects humans have taken on — like rocket engineering. You have this huge machine that you’re building. It consists of tons of different parts and sensors. If one team messes up by a hair, your rocket blows up: that’s the experience of building these models.

They’re so extraordinarily sensitive. They come from this pipeline that involves 10 or 12 stages. Each stage depends on the one before it and, if one person makes the slightest mistake as the model is passing through this pipeline, the whole thing collapses.

The moat comes from the fact that this is one of the most complicated things humans have ever done. You really do need to invest extraordinary capital to do this thing well. It’s what it takes. So, I think there’s a very strong moat, just from the fact it is such a difficult project to accomplish with such specialised knowledge.

Google’s Bay View campus in Mountain View, California, where Aidan Gomez was an intern © David Paul Morris/Bloomberg

GH: A decade from now, where do you think AI is going to be popping up in our daily lives, beyond the plaything it is at the moment?

AG: I think in the same way that most of us probably spend the majority of our time on a device, on a screen — whether it’s your phone or your laptop or TV — a big chunk of your time is going to be spent communicating with these models. They are going to be your interface to the world.

If you want to get some shopping done, you’ll just talk to a bot and say, ‘Hey, can you get this delivered to my place?’ and it goes off and does it. ‘Hey, can you book a place for my wife and I in Hawaii in three weeks?’ You can just express your preferences or, even better, the bot already knows your preferences because you told it six months ago when you booked your last trip to Hawaii.

I think you’re going to be interacting with these models as frequently as you interact with your cell phone. It will become the primary means by which you accomplish tasks online, get stuff done, because it’s so much nicer than having to Google Search something and then click through 30 links. It’s just so much nicer to have an assistant do that for you. So, everyone will get that assistant and it will become just a standard part of our lives in the same way we carry around this piece of glass and sand.

GH: So far we’ve had this [generative AI] as a toy rather than as a tool. And it’s fine if you’re using ChatGPT and it hallucinates [generates unexpected, untrue results] and a poem comes out slightly sideways. But it’s less fine if this tool is digesting medical records and kicking out diagnoses. How do you get around that hallucination problem, particularly in your business?

AG: First and foremost, I think we have to remember that the technology is super early. We’re still in the first few days of this technology and so it will become increasingly robust over time. The other thing to remember is that humans will be in the loop for critical paths — potentially, forever.

There will be some things that we just fundamentally don’t feel comfortable taking a human out of the loop for — and maybe that is something like medicine. I actually think that medicine is a place where this can be completely transformational and improve patient outcomes massively because there is so much human error in that system. My grandmother died from human error in the healthcare system.

Aidan Gomez, co-founder and CEO of Cohere, in Palo Alto, California © Jeff Chiu/AP

I think there’s tons of room for positive change but, in the early days, it’s going to look like augmentation, not replacement. It’s going to look like doctors assisted by models, models giving doctors feedback: ‘Have you thought about this? Are you missing this? Should we run this test?’ So, it’s going to be an assistant to humans, as opposed to an end-to-end replacement for them.

For high-stakes applications, that’s going to be the case. But I think, in general, the reliability of these models is only going to increase and the unreliability that you see today is a product of the fact that it’s still so, so early. I’m actually shocked by how reliable the [results] are given how recent the technology is.

Hallucination is one [concern] that is coming up a lot. These models just make up facts. They are just fabricating stuff. [But] there are concrete steps to fix that. The first one is recognising that, sometimes, there is good hallucination, like what you described with a poem. You want it to be creative. You want it to make up novel, random stuff. That is hallucination and it is desirable in that case.

But if I’m asking you, hey, could you give me the state of the art on whether gravitons exist, I don’t want you to make up an answer and fabricate something. I really want you to be grounded and truthful and I want you to research it and give me a reliable answer. That’s something called retrieval-augmented generation. It’s where models go out into some knowledge space. Maybe it’s like the internet, maybe it’s some proprietary knowledge. Maybe it’s a journal or papers. The models can go out and they can query and search that knowledge base, read all the documents that are relevant and then come back with their fully formed answer.

What you can do there is force the model to actually cite its sources. Now, you have the ability to verify these models . . . you can actually read what the model read and say, ‘Oh yes, that actually does say that. Yes, there was a recent paper that said such and such’.

Retrieval augmentation is going to be a huge piece of faithfulness and our ability to trust these models, because we can verify them. That method was created by a guy called Patrick Lewis when he was at Meta. He now leads Cohere’s retrieval effort. These core components, they exist, they’re known already. What we need to do is integrate them back into our lives, bring that into the loop. It just takes time to do that.
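
To make the retrieval-augmented generation approach Gomez describes more concrete, here is a minimal Python sketch: it retrieves the documents most relevant to a query, builds a prompt that grounds the model in those documents, and asks for an answer that cites its sources. The toy keyword retriever and the `llm_generate` callable are illustrative placeholders, not Cohere’s actual system.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve relevant
# documents, then generate an answer grounded in them, with citations.
# The corpus format, the naive scorer and `llm_generate` are hypothetical
# stand-ins; a real system would use a vector index and a hosted model.
from typing import Callable, List, Tuple

Doc = Tuple[str, str]  # (source label, text)

def retrieve(query: str, corpus: List[Doc], k: int = 2) -> List[Doc]:
    """Rank documents by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(text.lower().split())), (source, text))
              for source, text in corpus]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def answer_with_citations(query: str, corpus: List[Doc],
                          llm_generate: Callable[[str], str]) -> str:
    """Build a grounded prompt from retrieved documents and ask the model
    to answer using only those documents, citing the sources it relies on."""
    docs = retrieve(query, corpus)
    context = "\n".join(f"[{source}] {text}" for source, text in docs)
    prompt = ("Answer the question using only the documents below, "
              "and cite the sources you use in [brackets].\n\n"
              f"Documents:\n{context}\n\nQuestion: {query}\nAnswer:")
    # Because the retrieved documents are known, the model's reply can be
    # checked against what it actually read.
    return llm_generate(prompt)
```

In a production setting the keyword retriever would be replaced by search over a proprietary knowledge base, but the grounding-and-citation pattern that makes answers verifiable is the same.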

GH: There was a very important qualifier in your point about job loss: “in the early days” [AI] would augment humans. At what point is it going to start supplanting them? And how deep is that replacement going to be?

AG: It really depends on the use case. There will be some tasks that humans currently carry out that I think get completely replaced, totally. Then, there are others that will never get replaced at all.

Text generated by ChatGPT, the AI chatbot developed by OpenAI © Florence Lo/Reuters

GH: What’s an example of the former?

AG: Something that gets completely replaced? I think that you’re going to get a lot of your customer service done by speaking to an intelligent agent. I think that a lot of that can be done by exchanging information and a request between you and some large language model which has access to the ability to resolve common issues for you.

So, for instance, changing a credit card or an address or returning an item or submitting a complaint. I think a lot of that is going to be able to be handled by these models in an end-to-end fashion. Even then, there are some things that are not going to be able to be handled — either because the model doesn’t have access to the system in the company that can effect the change to resolve the [customer’s] query, or because, let’s say, the user is just so incensed, so emotional, that they need a human there to manage that. So, that role will not evaporate completely. But the scale of humans necessary will dramatically decrease.

GH: [We’re now at] the sharp end of the conversation around regulation in AI, so I’m interested in your view on whether there is a case, as [Elon] Musk and others have advocated, for stopping things for six months and trying to get a handle on it. Or, as Microsoft’s chief economist [said] last month, let’s regulate once we’ve seen demonstrable harm come from AI. Where do you stand on that?

AG: I think the six-month pause letter is absurd. It is just categorically absurd.

GH: Because of the argument, or because of the signatories — or both?

AG: The argument and how it is framed, and also just the proposal. How would you implement a six-month pause practically? Who is pausing? And how do you enforce that? And how do we co-ordinate that globally? Is it just United States companies that pause? It makes no sense. The request is not plausibly implementable. So, that’s the first issue with it.

The second issue is the premise: there’s a lot of language in there talking about a superintelligent artificial general intelligence (AGI) emerging that can take over and render our species extinct; eliminate all humans. I think that’s a super dangerous narrative. I think it’s irresponsible. Some of these signatories have been calling for absurd positions. There’s a guy, [Eliezer] Yudkowsky, who is calling for the United States to be willing to militarily bomb data centres in adversary nations who don’t comply with slowing things down, and pausing things.

That’s really reckless and harmful and it preys on the general public’s fears because, for the better part of half a century, we’ve been creating media sci-fi around how AI could go wrong: Terminator-style bots and all these fears. So, we’re really preying on their fear.

Elon Musk, owner of Twitter, has changed the social media platform’s user verification system © Jordan Vonderhaar/Bloomberg

GH: Are there any grounds for that fear? When we’re talking about those signatories that are talking about the development of AGI and a potential singularity moment, is it a technically feasible thing to happen, albeit improbable?

AG: I think it’s so exceptionally improbable. There are real risks with this technology. There are reasons to fear this technology, and who uses it, and how. So, to spend all of our time debating whether our species is going to go extinct because of a takeover by a superintelligent AGI is an absurd use of our time and the public’s mindspace.

There’s real stuff we should be talking about. One, we can now flood social media with accounts that are truly indistinguishable from a human, so extremely scalable bot farms can pump out a particular narrative. We need mitigation strategies for that. One of those is human verification — so we know which accounts are tied to an actual, living human being so that we can filter our feeds to only include the legitimate human beings who are participating in the conversation. That’s one risk.

There are other major risks. We shouldn’t have reckless deployment of end-to-end medical advice coming from a bot without a doctor’s oversight. That should not happen. That’s just not the right way to deploy these systems. That’s not safe yet. They’re not at that level of maturity where that’s an appropriate use of them.

So, I think there are real risks and there’s real room for regulation. I’m not anti-regulation, I’m actually quite in favour of it. But I would really hope that the public knows some of the more fantastical stories about risk [are unfounded]. They’re distractions from the conversations that should be going on.

GH: I want to get into those risks in a bit more detail. You were talking about human verification and the dangers of going without it. Elon Musk has laid off whole teams of people doing that at Twitter. Does it concern you that a person with that attitude is now investing very heavily into his own AI research?

AG: I don’t know Elon personally, so I know the same public persona that I think everybody else knows and his posturing towards AI. He has been very focused on the safety of it and making sure that it gets used appropriately. And he’s critiqued others for deploying early and doing it in a way that’s reckless. From the public persona that I’ve known and heard, he seems to take it quite seriously. I didn’t know about that dismissal of human verification.

GH: Well . . . he’s reportedly cut 90 per cent of the staff at Twitter. His argument would be: by making people pay for a blue tick [on their Twitter profiles], you’re linking it to a credit card and payment, and therefore it’s a form of human verification. But I think he’s taken it in a slightly different direction to what was in place.

AG: I do think it’s really good that he’s prioritised some form of verification by that blue tick thing. You can complain about the price or whatever — maybe it should be free if you upload your driver’s licence, or something. But I do think it’s really important that we have some degree of human verification on all our major social media.

I think he’s done, frankly, good for Twitter on that front. I don’t know him personally, so I can’t say much about his positioning on this stuff and the actions he’s taken, whether they’re consistent with my impression, but I do think that blue tick verification on Twitter was quite good . . . the way he talks, he seems to take safety quite seriously.

GH: You mention misinformation. I’m conscious that, next year, in the US, there’s a general election and, already, I think Republicans have put out [an AI-generated] attack ad of Biden. Have you seen this? It’s all artificially generated but it’s tanks on the streets and the White House crumbling. The content is unreal but the visuals look completely convincing. How much danger is there that this undermines democratic processes in the way that social media has already started to do in some election cycles? Is the answer regulation in the short term?

AG: I think that humans are largely shaped by the opinions and the content that they see. Things get normalised just by exposure, exposure, exposure, exposure. So, if you have the ability to just pump people the same idea again and again and again, and you show them a reality in which it looks like there’s consensus — it looks like everyone agrees X, Y and Z — then I think you shape that person and what they feel and believe, and their own opinions. Because we’re herd animals.

We align with each other and, when we see the group has made a decision, we align with the group. So, if there weren’t these countermeasures against synthetic media that can just be pushed to the front of people’s feeds, I would be a lot more concerned. I actually feel fairly optimistic that there are human verification methods for mitigating that risk.

GH: How much time do you spend thinking about whether the technologies you’re developing are the right things to be creating, as opposed to just the permissible things to be creating?

AG: I spend a lot of time thinking about both the opportunities of the underlying technology and the risks. It’s a very general technology. It’s literally a machine where you can ask it to do something and it does it. So, in the same way that you could hire someone and ask them to do something and they do it, that is a very wide space of possible applications and use cases and scenarios where you might want to place it.

Due to that breadth, that generality, you get a lot of scenarios that you don’t want to deploy AI in. You get a lot of risks. I care a lot about ensuring that this goes well and the people who use Cohere’s models use them for the right use cases, and that the effect of the models is, overall, extremely positive.

So, you do what you can. You try to add friction points into misuse. You add safeguards, like monitoring tools which tell you if someone has signed up and is starting to generate content which looks problematic or toxic, and you flag that extremely quickly and you kick them off the platform and ban their IP. You do put up walls to make it as hard as possible for someone who wants to misuse this technology to do that.

I have a high degree of confidence that everyone in the field, everyone in the space, is hyperconscious of the risks and is investing a lot of their time in mitigating them.
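
As a rough illustration of the kind of safeguard described above (monitoring generations, flagging problematic content and blocking the offending account), here is a small Python sketch. The watch-list, the scoring rule, the threshold and the ban logic are invented for illustration; a production system would rely on trained classifiers, rate limits and human review rather than this toy heuristic.

```python
# Toy sketch of a misuse safeguard: score each generation, flag anything
# that looks problematic and block the account/IP that produced it.
# FLAGGED_TERMS, the scoring rule and the threshold are all hypothetical.
FLAGGED_TERMS = {"phishing", "malware", "scam"}
TOXICITY_THRESHOLD = 0.5

banned_ips = set()  # IPs that have been kicked off the platform

def toxicity_score(text: str) -> float:
    """Stand-in scorer: fraction of flagged terms that appear in the text."""
    words = set(text.lower().split())
    return len(words & FLAGGED_TERMS) / len(FLAGGED_TERMS)

def review_generation(user_ip: str, generated_text: str) -> bool:
    """Return True if the generation is allowed; otherwise flag and ban."""
    if user_ip in banned_ips:
        return False
    if toxicity_score(generated_text) >= TOXICITY_THRESHOLD:
        banned_ips.add(user_ip)  # kick the user off and block the IP
        return False
    return True
```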

GH: I was very struck by the analogy you used when you compared the engineering process behind AI to rocket development. If something goes wrong by a hair’s breadth then you have an explosion, and it is not an idle analogy to be using. There is a huge responsibility on you as the people at the forefront of this technology. I appreciate all the safeguards you’ve just described but is there really a way that you can control the outcome?

AG: I think there is. You make very conscious decisions about what sort of use you’ll tolerate with your models and you are empowered to enforce those rules and those terms. I don’t think you just build a model, and then release it into the world, and just hope things go well.

It’s really not like that. We deploy in much more controlled scenarios — specifically, with enterprises where we know exactly what they’re using it for. It’s very high trust, very high visibility into what they’re using it for. So, we’re able to intervene. We aren’t just helpless in how these models get used.

I think it’s very different if you open-source a model. Then, you’d better hope you’ve sufficiently mitigated misuse within that model because you don’t know what it’s used for — and you don’t know who is using it.

GH: That feels like an enormous danger, given we do have pretty sophisticated open-source models out there already, right?

AG: Yes. I totally agree.

GH: Is there a specific place for regulation on that front?

AG: Maybe people view open-source models as Cohere’s competition . . . so I would come across as disingenuous pushing for regulation of open-source models. I think the world should make a decision on what sort of risks they’re willing to tolerate.

I come from the open-source community. I am a researcher of machine learning and I benefited massively from open research and the release and communication and distribution of knowledge. So I really appreciate that. I also have a very high degree of appreciation for the potential of this technology to do both good and harm. Making it very, very, very easy to do harm seems like a very bad idea.

Copyright The Financial Times Limited 2024. All rights reserved.