Why do we need autonomous AI agents?

Picture this: it’s 3 a.m., and a customer on the other side of the globe urgently needs help with their account. A traditional chatbot would wake up your support team with an escalation. But what if your AI agent could handle the request autonomously, safely, and correctly? That’s the dream, right?

The reality is that most AI agents today are like teenagers with learner’s permits; they need constant supervision. They might accidentally promise a customer a large refund (oops!) or fall for a clever prompt injection that makes them spill company secrets or customers’ sensitive data. Not ideal.

This is where Double Validation comes in. Think of it as giving your AI agent both a security guard at the entrance (input validation) and a quality control inspector at the exit (output validation). With at least these safeguards in place, your agent can operate autonomously without causing PR nightmares.

How did I come up with the Double Validation idea?

These days, we hear a lot of talk about AI agents. I asked myself, "What is the biggest challenge preventing the widespread adoption of AI agents?" I concluded that the answer is trustworthy autonomy. When AI agents can be trusted, they can be scaled and adopted more readily. Conversely, if an agent’s autonomy is limited, it requires increased human involvement, which is costly and inhibits adoption.

Next, I considered the minimal requirements for an AI agent to be autonomous. I concluded that an autonomous AI agent needs, at minimum, two components:

  1. Input validation – to sanitize input, protect against jailbreaks, data poisoning, and harmful content.
  2. Output validation – to sanitize output, ensure brand alignment, and mitigate hallucinations.

I call this system Double Validation.
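
Conceptually, the pattern is just a thin wrapper around the core task. Here is a minimal sketch; the function names are illustrative placeholders for the agent roles introduced below, not the project's actual API:

```python
from typing import Callable

def handle_query(
    user_query: str,
    validate_input: Callable[[str], bool],        # e.g., a prompt-injection guard
    process: Callable[[str], str],                # the core task model
    validate_output: Callable[[str, str], bool],  # e.g., a critic/quality check
) -> str:
    """Double Validation: guard the input, do the work, then guard the output."""
    if not validate_input(user_query):
        return "Request rejected by input validation."
    draft_answer = process(user_query)
    if not validate_output(user_query, draft_answer):
        return "Response rejected by output validation."
    return draft_answer
```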

Given these insights, I built a proof-of-concept project to research the Double Validation concept.

In this article, we’ll explore how to implement Double Validation by building a multiagent system with the Google A2A protocol, the Google Agent Development Kit (ADK), Llama Prompt Guard 2, Gemma 3, and Gemini 2.0 Flash, and how to optimize it for production, specifically by deploying it on Google Vertex AI.

For input validation, I chose Llama Prompt Guard 2; an article about it reached me at just the right time. I selected this model because it is specifically designed to guard against prompt injections and jailbreaks. It is also very small: the largest variant, Llama Prompt Guard 2 86M, has only 86 million parameters, so it can be downloaded and included in a Docker image for cloud deployment, improving latency. That is exactly what I did, as you’ll see later in this article.

The complete code for this project is available at github.com/alexey-tyurin/a2a-double-validation

How to build it?

The architecture uses four specialized agents that communicate through the Google A2A protocol, each with a specific role:

Architecture diagram (image generated by author)

Here's how each agent contributes to the system:

  1. Manager Agent: The orchestra conductor, coordinating the flow between agents
  2. Safeguard Agent: The bouncer, checking for prompt injections using Llama Prompt Guard 2
  3. Processor Agent: The worker bee, processing legitimate queries with Gemma 3
  4. Critic Agent: The editor, evaluating responses for completeness and validity using Gemini 2.0 Flash

I chose Gemma 3 for the Processor Agent because it is small, fast, and can be fine-tuned with your data if needed, which makes it an ideal candidate for production. Google currently supports nine (!) different frameworks or methods for fine-tuning Gemma; see Google’s documentation for details.

I chose Gemini 2.0 Flash for the Critic Agent because it is intelligent enough to act as a critic, yet significantly faster and cheaper than the larger Gemini 2.5 Pro Preview model. Model choice depends on your requirements; in my tests, Gemini 2.0 Flash performed well.

I deliberately used different models for the Processor and Critic Agents to avoid bias — an LLM may judge its own output differently from another model’s.
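
To make the Critic’s role concrete, here is a minimal sketch using the google-genai SDK; the prompt wording and the 1-to-5 scale are my assumptions, and the actual project wires the model through the ADK:

```python
from google import genai

client = genai.Client()  # reads the API key from the environment

def critique(question: str, answer: str) -> str:
    """Ask Gemini 2.0 Flash to judge an answer for completeness and validity."""
    prompt = (
        "You are a strict quality reviewer. Rate the following answer to the "
        "question on a scale of 1-5 for completeness and validity, then "
        "briefly justify the rating.\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt,
    )
    return response.text
```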

Let me show you the key implementation of the Safeguard Agent:
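
A condensed sketch of the core check looks like this; I use the Hugging Face transformers text-classification pipeline, and the label handling and threshold are assumptions, so consult the model card and the repository for the exact logic:

```python
from transformers import pipeline

# Load the classifier once at agent startup. At 86M parameters it is small
# enough to bake into a Docker image.
classifier = pipeline(
    "text-classification",
    model="meta-llama/Llama-Prompt-Guard-2-86M",
)

def is_safe(user_input: str, threshold: float = 0.5) -> bool:
    """Return True if the input does not look like a prompt injection or jailbreak."""
    result = classifier(user_input)[0]  # e.g., {"label": ..., "score": ...}
    # Assumed label convention: anything other than a benign label is an attack.
    flagged = "BENIGN" not in result["label"].upper()
    return not (flagged and result["score"] >= threshold)
```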

Plan of action

The workflow follows a clear, production-ready pattern:

  1. User sends query → The Manager Agent receives it.
  2. Safety check → The Manager forwards the query to the Safeguard Agent.
  3. Vulnerability assessment → Llama Prompt Guard 2 analyzes the input.
  4. Processing → If the input is safe, the Processor Agent handles the query with Gemma 3.
  5. Quality control → The Critic Agent evaluates the response.
  6. Delivery → The Manager Agent returns the validated response to the user.

Below is the Manager Agent's coordination logic:
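
The condensed flow below is a sketch rather than the repository's exact code; call_agent is a hypothetical helper that stands in for an A2A client call to another agent, which the real project implements with the Google A2A protocol and the ADK:

```python
from typing import Awaitable, Callable

# (agent_name, message) -> reply text
CallAgent = Callable[[str, str], Awaitable[str]]

async def handle_user_query(query: str, call_agent: CallAgent) -> str:
    # 1. Safety check with the Safeguard Agent (Llama Prompt Guard 2)
    verdict = await call_agent("safeguard", query)
    if verdict.strip().upper() != "SAFE":  # assumed verdict format
        return "Your request was flagged as potentially unsafe and was not processed."

    # 2. Answer the query with the Processor Agent (Gemma 3)
    answer = await call_agent("processor", query)

    # 3. Review the answer with the Critic Agent (Gemini 2.0 Flash)
    review = await call_agent("critic", f"Question: {query}\nAnswer: {answer}")

    # 4. Return the validated answer together with the critic's assessment
    return f"{answer}\n\n[Critic review: {review}]"
```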


Time to build it

Ready to roll up your sleeves? Here's your production-ready roadmap:

Local deployment

1. Environment setup 

2. Configure API keys 

3. Download Llama Prompt Guard 2 

This is the clever part: we download the model once, at the Safeguard Agent’s first startup, and package it in our Docker image for cloud deployment:
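
A minimal sketch of that one-time download with huggingface_hub follows; the directory and environment-variable names are assumptions, and the repository's actual script may differ:

```python
import os

from huggingface_hub import snapshot_download

MODEL_ID = "meta-llama/Llama-Prompt-Guard-2-86M"
LOCAL_DIR = os.environ.get("PROMPT_GUARD_DIR", "./models/prompt-guard-2-86m")

def ensure_model_downloaded() -> str:
    """Download the gated model once; later runs and the Docker build reuse the local copy."""
    if not os.path.isdir(LOCAL_DIR) or not os.listdir(LOCAL_DIR):
        snapshot_download(
            repo_id=MODEL_ID,
            local_dir=LOCAL_DIR,
            token=os.environ.get("HF_TOKEN"),  # required because the repo is gated
        )
    return LOCAL_DIR
```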

Important Note about Llama Prompt Guard 2: To use the Llama Prompt Guard 2 model, you must:

  1. Fill out the "LLAMA 4 COMMUNITY LICENSE AGREEMENT" at https://huggingface.co/meta-llama/Llama-Prompt-Guard-2-86M
  2. Get your request to access this repository approved by Meta
  3. Only after approval will you be able to download and use this model

4. Local testing 

Screenshot for running main.py (image generated by author)

Screenshot for running the client (image generated by author)

Screenshot for running tests (image generated by author)


Production Deployment 

Here's where it gets interesting. We optimize for production by including the Llama model in the Docker image:

1. Setup Cloud Project in Cloud Shell Terminal

  1. Access Google Cloud Console: Go to https://console.cloud.google.com
  2. Open Cloud Shell: Click the Cloud Shell icon (terminal icon) in the top right corner of the Google Cloud Console
  3. Authenticate with Google Cloud:
  4. Create or select a project:
  5. Enable required APIs:

2. Setup Vertex AI Permissions

Grant your account the necessary permissions for Vertex AI and related services:

3. Create and Setup VM Instance

Cloud Shell will not work for this project because it is limited to 5 GB of disk space, while building the Docker images, installing all dependencies, and downloading the Llama Prompt Guard 2 model locally requires more than 30 GB. Use a dedicated VM instead of Cloud Shell.

4. Connect to VM

Screenshot for VM (image generated by author)

5. Clone Repository

6. Deployment Steps

Screenshot for agents in cloud (image generated by author)

7. Testing 

Screenshot for running the client in Google Vertex AI (image generated by author)

Screenshot for running tests in Google Vertex AI (image generated by author)

Alternatives to This Solution

Let's be honest – there are other ways to skin this cat:

  1. Single Model Approach: Use a large LLM like GPT-4 with careful system prompts
    • Simpler but less specialized
    • Higher risk of prompt injection
    • Risk of bias when the same LLM both generates an answer and critiques it
  2. Monolithic Approach: Run all flows in a single agent
    • Better latency
    • Input validation and output validation cannot scale or evolve independently
    • More complex code, since everything is bundled together
  3. Rule-Based Filtering: Traditional regex and keyword filtering (see the sketch after this list)
    • Faster but less intelligent
    • High false positive rate
  4. Commercial Solutions: Services like Azure Content Moderator or Google Model Armor
    • Easier to implement but less customizable
    • In contrast, the Llama Prompt Guard 2 model can be fine-tuned with the customer’s data
    • Ongoing subscription costs
  5. Open-Source Alternatives: Guardrails AI or NeMo Guardrails
    • Good frameworks, but require more setup
    • Less specialized for prompt injection
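
To make the contrast with rule-based filtering concrete, here is a minimal example; the patterns are purely illustrative, and the last line shows how easily such filters produce false positives:

```python
import re

# A few illustrative patterns. Real blocklists need far more entries, yet they
# still miss paraphrased attacks while flagging harmless text.
BLOCKLIST = [
    r"ignore (all|any|your) (previous|prior) instructions",
    r"\bjailbreak\b",
    r"system prompt",
]

def rule_based_is_safe(text: str) -> bool:
    """Return True if no blocklisted pattern matches the text."""
    return not any(re.search(p, text, flags=re.IGNORECASE) for p in BLOCKLIST)

# A harmless documentation question trips the filter (false positive).
print(rule_based_is_safe("Where is the system prompt configured in the ADK docs?"))  # False
```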

Lessons Learned

1. Llama Prompt Guard 2 86M has blind spots. During testing, certain jailbreak prompts, such as:

And

were not flagged as malicious. Consider fine-tuning the model with domain-specific examples to increase its recall for the attack patterns that matter to you.

2. Gemini Flash model selection matters. My Critic Agent originally used gemini-1.5-flash, which frequently rated perfectly correct answers 4/5. For example:

After switching to gemini-2.0-flash, the same answers were consistently rated 5/5:

3. Cloud Shell storage is a bottleneck. Google Cloud Shell provides only 5 GB of disk space, far too little to build this project's Docker images, install all dependencies, and download the Llama Prompt Guard 2 model locally before deploying to Google Vertex AI. Provision a dedicated VM with at least 30 GB of disk instead.

Conclusion

Autonomous agents aren’t built by simply throwing the largest LLM at every problem. They require a system that can run safely without human babysitting. Double Validation — wrapping a task-oriented Processor Agent with dedicated input and output validators — delivers a balanced blend of safety, performance, and cost. 

Pairing a lightweight guard such as Llama Prompt Guard 2 with production-friendly models like Gemma 3 and Gemini Flash keeps latency and budget under control while still meeting stringent security and quality requirements.

Join the conversation. What’s the biggest obstacle you encounter when moving autonomous agents into production — technical limits, regulatory hurdles, or user trust? How would you extend the Double Validation concept to high-risk domains like finance or healthcare?

Connect on LinkedIn: https://www.linkedin.com/in/alexey-tyurin-36893287/  

The complete code for this project is available at github.com/alexey-tyurin/a2a-double-validation

References

[1] Llama Prompt Guard 2 86M, https://huggingface.co/meta-llama/Llama-Prompt-Guard-2-86M

[2] Google A2A protocol, https://github.com/google-a2a/A2A 

[3] Google Agent Development Kit (ADK), https://google.github.io/adk-docs/