What happens when our machines begin to understand us as naturally as we understand each other?

That’s not a question for the future – it’s one we’re living through right now.

Training today’s large language models already costs upwards of $100 million. Just last year, two Nobel Prizes were awarded for AI breakthroughs. That’s extraordinary. It signals something profound: we’ve crossed a threshold where artificial intelligence isn’t just solving problems, it’s transforming how we think, create, and interact.

In my career, from leading research at Google DeepMind to my current work as Chief AI Officer at Genesis Therapeutics, I’ve seen AI evolve from brittle systems that followed commands to flexible partners capable of reasoning, learning, and even showing hints of personality.

So in this article, I’ll reflect on where that journey has taken us, and where it’s leading next. We’ll explore how large language models (LLMs) are changing natural interaction, unifying control across systems, and even learning autonomously.

Most importantly, we’ll consider what these breakthroughs mean for the path toward Artificial General Intelligence (AGI) – and for the safety, responsibility, and humanity of the field we’re building together.

Let’s get started.

Teaching robots to understand us

When I first started in robotics, giving instructions to a robot was about as intuitive as writing assembly code. You had to specify coordinates, velocities, joint angles – every micro-step.

Now, imagine instead saying something like:

“Trot forward slowly.”

“Back off – don’t hurt the squirrel.”

“Act like you’re limping.”

And the robot simply understands.

That’s the leap we’ve made thanks to large language models. In one of our early projects, we used GPT-4 to control a quadruped robot. Underneath, a traditional controller handled the physical contact patterns – blue meant touch, yellow meant lift – while the LLM acted as an intelligent interface translating natural language into motor commands.
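
To make that division of labor concrete, here is a minimal sketch of the pattern described above – an LLM translating a natural-language command into a foot-contact pattern that a conventional low-level controller then executes. The prompt wording, the `llm` callable, and the parsing format are my own illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch only: the prompt format, the `llm` callable, and the parsing
# below are assumptions, not the actual system described in the text.

PROMPT_TEMPLATE = """You control a quadruped robot by choosing a foot-contact pattern.
Each output line is one foot (FL, FR, RL, RR), a colon, then 10 timesteps,
where 1 means the foot touches the ground and 0 means it is lifted.
Command: "{command}"
"""

def contact_pattern_from_command(command: str, llm) -> dict:
    """Ask the LLM for a contact pattern and parse it into {foot: [0/1 per timestep]}."""
    reply = llm(PROMPT_TEMPLATE.format(command=command))  # llm: any text-in/text-out callable
    pattern = {}
    for line in reply.strip().splitlines():
        foot, bits = line.split(":", 1)
        pattern[foot.strip()] = [int(b) for b in bits.split()]
    return pattern

# A traditional gait controller then consumes the pattern step by step:
# 1 = keep the foot in contact ("touch"), 0 = lift it.
```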

What amazed us wasn’t just that it worked – it was that it generalized. You could tell the robot, “Good news, we’re going on a picnic!” and it would literally jump around.

That’s what I mean by natural interaction. For the first time, non-experts can access complex AI systems without programming or robotics expertise. It’s a fundamental shift – one that opens up AI to millions more people and use cases.

Code as a common language

Across robotics, web agents, and digital assistants, one big barrier has always been fragmentation. Every system speaks a different “action language.”

  • A self-driving car thinks in terms of steering angle and acceleration.
  • A quadruped robot thinks in terms of joint torques.
  • A web agent manipulates HTML elements.

There’s no universal interface.

But code might just be that universal action space.

Let me give you an example. We built a web navigation agent capable of executing multi-step searches entirely from natural instructions like:

“Find one-bedroom apartments in Ortonville for corporate housing, starting from google.com.”

The agent reads the raw HTML (we’re talking megabytes of unstructured data), plans the next steps, writes the necessary code, executes it, and repeats – closing the loop autonomously.
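
In spirit, that loop looks something like the sketch below. It is a simplified illustration under my own assumptions – `llm` and `browser` stand in for whatever model endpoint and browser-automation layer you use; this is not the agent's actual code.

```python
# Simplified illustration of the observe -> plan -> write code -> execute loop.
# `llm` and `browser` are assumed interfaces, not a specific library.

def run_web_agent(task: str, browser, llm, max_steps: int = 20) -> None:
    for _ in range(max_steps):
        html = browser.page_source()  # raw, unstructured HTML; real pages can be megabytes
        prompt = (
            f"Task: {task}\n"
            f"Current page HTML (truncated):\n{html[:20000]}\n\n"
            "Write Python code that uses the `browser` object to take the next step, "
            "or reply DONE if the task is complete."
        )
        plan = llm(prompt)
        if plan.strip() == "DONE":
            break
        # Letting a model write and execute its own code is exactly the capability
        # discussed below; in practice this must run in a tightly sandboxed environment.
        exec(plan, {"browser": browser})
```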

With just 200 self-collected demonstrations, this modular system learned to generalize across entirely new websites. We achieved success rates between 65% and 80% on real-world domains like real estate, Reddit queries, and Google Maps.

However, this capability also raised early concerns about AI safety. I remember vividly in late 2022, right as ChatGPT launched, we were discussing whether agents should be allowed to write and execute code on their own. That’s a powerful and potentially dangerous ability.

So while this experiment demonstrated how LLMs can unify action across domains, it also reminded us that capability must come with control.

When AI grows a personality: Understanding synthetic personas

Here’s something unexpected: large language models don’t just talk – they impersonate.

That raised a question my team explored: Do LLMs have human-consistent personality traits?

We collaborated with psychometricians to test models using the Big Five personality framework – openness, conscientiousness, extraversion, agreeableness, and neuroticism.

The result? Yes. And the larger the model, the more human its responses became. Interestingly, they also became more exaggerated – like caricatures of human personality.

For instance, when prompted for high agreeableness, the model might say:

“I just want to be like my mother – she’s the kindest person I know.”

When prompted for low agreeableness, it would respond:

“I’m terrible at keeping my house clean. I just don’t care enough.”

Beyond being fascinating, this finding has real implications. We discovered we could control these traits through prompts, using more than 120 adjectives to shape personality on demand.
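
As a rough illustration of how that prompt-level control works (the adjective lists below are stand-ins I've chosen, not the actual set from the study):

```python
# Hedged sketch: the adjectives here are illustrative, not the study's actual list.

TRAIT_ADJECTIVES = {
    ("agreeableness", "high"): ["warm", "cooperative", "trusting", "considerate"],
    ("agreeableness", "low"):  ["critical", "blunt", "stubborn", "detached"],
    ("extraversion", "high"):  ["outgoing", "talkative", "energetic", "assertive"],
}

def persona_prompt(trait: str, level: str, question: str) -> str:
    """Build a prompt asking the model to answer in a persona with the given trait level."""
    adjectives = ", ".join(TRAIT_ADJECTIVES[(trait, level)])
    return f"Answer as a person who is {adjectives}.\nQuestion: {question}"

print(persona_prompt("agreeableness", "high", "Tell me about someone you admire."))
```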

For applications like social media assistants or virtual companions, this opens up exciting possibilities, but it also raises ethical questions. What does it mean to engineer personality? And what responsibility do we have when people begin to relate to these personas as if they were real?

Synthetic worlds and the rise of hyperreal AI

Another striking development is how AI now generates synthetic environments – not just text or images, but full motion, light, and physics.

Today, generative video models can create hyper-realistic visuals from a single image and prompt. Watching a stream of maple syrup pour with perfect viscosity or a coffee splash refract light correctly would once have required supercomputing-level simulation. Now, it’s generated by a model.

These synthetic worlds are more than artistic demos. They’re training grounds – environments where AI agents can experiment, act, and learn at a massive scale.

In robotics and embodied AI, this means we can simulate millions of physical interactions safely before ever touching real hardware. In digital applications, it allows for virtual societies of AI personas that evolve together.

We’re entering a phase where models don’t just operate in our world; they can create their own.

Self-learning and model autonomy

One of the most profound shifts in recent years is that models are beginning to teach themselves.

Traditionally, AI relied heavily on human-labeled examples – the “few-shot learning” paradigm. But LLMs have introduced a new frontier: in-context learning. The more examples they see in a single prompt, the better they get – not because they’re retrained, but because they reason within the context itself.

Now, imagine replacing human examples with AI-generated ones.

In our experiments, when models labeled their own data (a process we called unsupervised in-context learning), or when they proposed answers and received feedback (reinforced in-context learning), their performance sometimes exceeded that of human-trained models.
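
Here is a minimal sketch of the unsupervised variant, under my own assumptions about prompt format (this is not the experimental code): the model first labels unlabeled inputs itself, and those self-labeled pairs are placed back into the prompt as demonstrations.

```python
# Sketch of unsupervised in-context learning as described above; the prompt
# wording is illustrative and `llm` is any text-in/text-out callable.

def self_labeled_prompt(unlabeled_inputs: list, query: str, llm) -> str:
    examples = []
    for x in unlabeled_inputs:
        y = llm(f"Label the following input.\nInput: {x}\nLabel:")  # the model labels its own data
        examples.append(f"Input: {x}\nLabel: {y.strip()}")
    # The self-generated examples now serve as in-context demonstrations for the real query.
    return "\n\n".join(examples) + f"\n\nInput: {query}\nLabel:"
```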

Why? Because the model’s own distribution of data was closer to what it “understood.”

Even more fascinating is self-correction. You can prompt a model with:

“There might be an error in the solution above. Can you find it?”

And it often does.
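
As a sketch of that inference-time pattern (the prompts are illustrative, and `llm` again stands in for any text-in/text-out model call):

```python
# Hedged sketch of the self-correction prompt pattern quoted above: generate an
# answer, ask the model to look for an error, then ask it to revise.

def answer_with_self_correction(question: str, llm, rounds: int = 1) -> str:
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(rounds):
        critique = llm(
            f"Question: {question}\nProposed solution:\n{answer}\n"
            "There might be an error in the solution above. Can you find it?"
        )
        answer = llm(
            f"Question: {question}\nProposed solution:\n{answer}\n"
            f"Critique:\n{critique}\nWrite a corrected solution:"
        )
    return answer
```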

Training models for self-correction involves teaching them to make and then fix mistakes – first encouraging errors, then rewarding corrections. Over time, the model learns not just to answer correctly, but to reflect on its reasoning.

That’s the beginning of something like introspection – the first glimmer of self-improvement without human retraining.

Defining the levels of AGI

With progress moving so fast, it helps to have a shared framework to understand where we are – and where we might be going.

I often borrow a concept from autonomous driving: the levels of autonomy. We can apply a similar idea to AGI.

  • Level 0: No AI. Think of a calculator – narrow and static.
  • Level 1: AI as a tool – assisting but requiring human control.
  • Level 2: AI as a consultant or collaborator – increasingly trusted, but still dependent on humans for direction.
  • Level 3: AI as an expert – autonomous within its domain, capable of making decisions.
  • Level 4+: Superhuman systems that could outperform humans across multiple domains.

Right now, we’re somewhere between Levels 1 and 2. AI tools are augmenting our work, but we’re also starting to see “over-trust” – people relying on AI decisions without verification.

That’s why AI literacy is so crucial. Understanding not just what AI can do, but how it reasons, helps us collaborate safely and effectively.

The dual edge of progress: Safety and societal impact

Let’s be honest; we’re building technology with enormous potential for good, but also for harm.

Used wisely, AI will accelerate scientific discovery, democratize education, and remove friction from access to expertise. In the wrong hands, it could do the opposite.

That’s why we must make safety part of our engineering DNA. I often say we should borrow from the disciplines that have already faced this challenge – systems engineering, aviation, and error-tolerant design. These fields treat safety not as an add-on but as a foundational principle.

AI should be the same.

One of the biggest gaps today is context awareness. The same prompt can be safe or unsafe depending on why it’s asked. For example, “How do I blow up a bridge?” is obviously dangerous, unless the model is being used in a counterterrorism training context, where that knowledge is precisely what’s needed.

We’re missing that middle layer – the contextual “awareness” that helps AI distinguish intent.

Similarly, manipulation and bias are not just technical issues; they’re systemic ones. We need secure architectures and locked prompts that can’t be easily jailbroken – the AI equivalent of safety locks in industrial systems.
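
To make that missing middle layer tangible, here is a toy sketch of my own – not a production safety system – in which the same request is allowed or refused based on a declared, verified deployment context rather than on keywords alone.

```python
# Toy illustration of context-aware gating; the topics, domains, and policy
# here are assumptions for the sake of example, not a real safety stack.

from dataclasses import dataclass

@dataclass
class DeploymentContext:
    domain: str      # e.g. "counterterrorism-training", "consumer-chat"
    verified: bool   # has the operator's claim about the domain been vetted?

SENSITIVE_TOPICS = {"explosives", "bioweapons"}
ALLOWED_DOMAINS = {"explosives": {"counterterrorism-training", "demolition-engineering"}}

def gate(topic: str, ctx: DeploymentContext) -> bool:
    """Return True if the request may be passed on to the model."""
    if topic not in SENSITIVE_TOPICS:
        return True
    return ctx.verified and ctx.domain in ALLOWED_DOMAINS.get(topic, set())
```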

When will AGI arrive – and what will it look like?

People often ask me: When will AGI arrive?

The truth is, it won’t be a single morning when we wake up and say, “Ta-da! AGI is here.” Progress will be uneven, accelerating in some areas while lagging in others.

We’re already seeing remarkable leaps in fields where ground truth exists, like coding, math, and reasoning. Business processes with clear success metrics are next.

But when it comes to social and emotional intelligence, things get trickier. Models that must interpret human intent, adapt to dynamic environments, or operate in physical spaces face deeper challenges.

And then there’s embodiment – robotics, self-driving cars, and physical AI. Hardware is still hard. Manufacturing, logistics, and safety certification take time. AI can help optimize those processes, but it can’t teleport atoms.

So, in my view?

  • In the next few years, we’ll see automation of mechanical and repetitive tasks.
  • Within five to ten years, we’ll approach higher-level planning and reasoning capabilities – especially in digital domains.
  • Beyond that, embodied and emotional intelligence will follow – slower, but inevitable.

As for whether AGI will emerge from large language models themselves, I don’t think next-token prediction alone will get us there.

Future architectures must model the dialogue loop – understanding that every response changes the environment, including the human user. Until AI systems can account for that dynamic, they’ll remain sophisticated pattern-matchers, not true general intelligences.

Final thoughts

If there’s one takeaway I want to leave you with, it’s this:

We are building extraordinarily powerful systems. Systems that can uplift humanity or undermine it.

The difference lies in how we build and deploy them.

That means designing for safety, aligning incentives, regulating responsibly, and keeping humans and their values in the loop.

But I don’t want to end on a note of fear. Because I truly believe that, with the right guardrails, this technology will do incredible good. It can accelerate drug discovery, optimize energy use, personalize learning, and make expertise universally accessible.

It’s up to us – researchers, engineers, policymakers, and creators – to make sure we steer it wisely.


This article comes from Aleksandra Faust’s talk at our 2025 Silicon Valley Chief AI Officer Summit. Check out her full presentation and the wealth of OnDemand resources waiting for you.