Picture this: your company sits on tens of petabytes of data. To put that into perspective, if I had a penny for each byte and stacked them up, I'd have enough to reach Pluto and back, with some change left over.
That’s the reality we face at Rocket Mortgage, and it's probably not too different from what your organization is facing.
That’s why we built Rocket Analytics, and today I want to take you behind the scenes of how we created a text-to-SQL application using agentic RAG (Retrieval-Augmented Generation).
This tool fundamentally changes how our teams interact with data, letting them focus on what they do best: asking strategic and thoughtful questions, while the system handles the technical heavy lifting.
What Rocket Analytics actually does
Here’s how it works in practice: a user asks a natural language question, such as:
“Give me the count of loans for the past six months.”
Behind the scenes, the system:
- Converts the question into a SQL query
- Executes it against the relevant database
- Returns the results in a clean, understandable format
During a recent demo, someone went from raw loan counts to a comprehensive dashboard showing:
- Total loans closed
- Average daily loans
- Dates with peak daily loan volume
- Trend analysis
—all within seconds. For executives and stakeholders in the mortgage industry, where speed of decision-making is crucial, this capability is transformative.
The architecture behind the magic
Rocket Analytics relies on several key components, each playing a critical role in delivering accurate results.
Query transformation: Making questions understandable
The journey starts when a user inputs their question. Large language models, while powerful, don’t inherently understand our specific business context or data structures.
Ask ChatGPT, for example:
“Can you give me the top ten days for loan production?”
It won’t know exactly what you mean.
Our query transformation module converts natural language into explicit, actionable instructions. That same question becomes:
“Write a SQL query to get the top ten days based on loan closing.”
This step is crucial because we never pass actual data to the model—only metadata about our database structures.
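To make the idea concrete, here is a minimal sketch of the transformation step. In production an LLM performs the rewrite; this rule-based stand-in only illustrates the shape of the input and output, and every name in it (`transform_query`, `PHRASE_MAP`) is hypothetical:

```python
# Illustrative sketch: rewrite a natural-language question into an explicit
# SQL-generation instruction. Production uses an LLM for this rewrite; a
# phrase map stands in here so the example is self-contained.

PHRASE_MAP = {
    "top ten days for loan production": "the top ten days based on loan closing",
}

def transform_query(question: str) -> str:
    """Turn a vague business question into an explicit instruction.

    Note: only the question text is passed onward -- never row-level data.
    """
    normalized = question.lower().strip(" ?")
    for vague, explicit in PHRASE_MAP.items():
        if vague in normalized:
            return f"Write a SQL query to get {explicit}."
    # Fall back to a generic instruction wrapper.
    return f"Write a SQL query to answer: {question}"

print(transform_query("Can you give me the top ten days for loan production?"))
# -> Write a SQL query to get the top ten days based on loan closing.
```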

Building and managing the knowledge base
Before queries can be answered, we need a comprehensive knowledge base. We:
- Convert database metadata into embeddings using Amazon Titan
- Store them in a FAISS (Facebook AI Similarity Search) vector store
- Include information about all tables, schemas, relationships, and business context
This knowledge base is the foundation that allows the system to understand which tables might be relevant for any given question.
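The build step can be sketched as follows. Production uses Amazon Titan embeddings stored in FAISS; this stand-in substitutes a toy bag-of-words embedding and a plain list so it runs anywhere, and all names and descriptions are hypothetical:

```python
# Illustrative knowledge-base build: embed table *metadata*, never table rows.
# Toy embedding stands in for Amazon Titan; a list stands in for FAISS.
from collections import Counter

VOCAB = ["loan", "closing", "campaign", "marketing", "payment", "servicing"]

def embed(text: str) -> list[float]:
    """Toy embedding: counts of known vocabulary words (Titan in production)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def index_metadata(tables: dict[str, str]) -> list[tuple[str, list[float]]]:
    """Embed each table's metadata description and keep (name, vector) pairs."""
    return [(name, embed(desc)) for name, desc in tables.items()]

knowledge_base = index_metadata({
    "loan_closings": "loan closing dates amounts and loan status",
    "campaign_stats": "marketing campaign performance by channel",
})
print([name for name, _ in knowledge_base])
```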
Intelligent retrieval: Finding the right data sources
When a question comes in, we perform a hybrid search:
- Semantic similarity search identifies potentially relevant tables
- Keyword triggers ensure critical tables are included even if semantic search alone misses them
For example, a knowledge base with 15 tables might be narrowed down to 4–5 most relevant. This ensures the LLM is not overwhelmed and reduces errors.
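The two-pronged retrieval above can be sketched like this. The toy vectors, table names, and trigger map are hypothetical; production searches a FAISS index over the real metadata embeddings:

```python
# Illustrative hybrid retrieval: semantic top-k plus keyword triggers.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy index: (table name, metadata embedding).
INDEX = [
    ("loan_closings",  [1.0, 0.9, 0.0]),
    ("campaign_stats", [0.1, 0.0, 1.0]),
    ("loan_payments",  [0.8, 0.2, 0.1]),
]
# Keyword triggers guarantee critical tables even if similarity misses them.
TRIGGERS = {"campaign": "campaign_stats"}

def hybrid_search(question_vec, question_text, k=2):
    ranked = sorted(INDEX, key=lambda t: cosine(question_vec, t[1]), reverse=True)
    selected = [name for name, _ in ranked[:k]]
    for keyword, table in TRIGGERS.items():
        if keyword in question_text.lower() and table not in selected:
            selected.append(table)
    return selected

print(hybrid_search([1.0, 1.0, 0.0], "loan counts by campaign"))
```

Here `campaign_stats` misses the semantic top-2 but is pulled in by its keyword trigger, exactly the failure mode the trigger list exists for.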
Re-ranking for precision
Once candidate tables are retrieved, we use Cohere’s re-ranker model to refine the selection. This re-ranker scores how likely each document is to be relevant to the user’s question, and only documents above a relevance threshold survive.
This step is essential for preventing hallucinations, ensuring the LLM generates accurate SQL queries based only on relevant metadata.
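A stand-in for this pass, scoring candidates by token overlap rather than calling Cohere’s hosted re-ranker, illustrates the filter; the function name, threshold, and descriptions are all hypothetical:

```python
# Illustrative re-ranking pass. Production calls Cohere's re-ranker; this
# stand-in scores candidates by Jaccard token overlap with the question and
# keeps only those above a relevance threshold.

def rerank(question: str, candidates: dict[str, str], threshold: float = 0.2):
    """Score each candidate's metadata description against the question."""
    q_tokens = set(question.lower().split())
    scored = []
    for table, description in candidates.items():
        d_tokens = set(description.lower().split())
        overlap = len(q_tokens & d_tokens) / len(q_tokens | d_tokens)
        scored.append((overlap, table))
    scored.sort(reverse=True)
    return [table for score, table in scored if score >= threshold]

print(rerank("count of loans closed", {
    "loan_closings": "loans closed by date",
    "campaign_stats": "marketing campaign clicks",
}))
```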
Prompt engineering: The art and science
Crafting prompts involves multiple layers:
- Standard guidelines: Always answer within context; use exact table and column names
- Adaptive guidelines: Few-shot examples for tricky question types
- Domain-specific context: Helps the model understand mortgage-industry terminology
A typical prompt layers these elements on top of the retrieved table metadata.
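The production prompt is internal, but a hedged sketch of how the layers might be assembled looks like this; the guideline strings, few-shot example, and table schema are illustrative, not the real ones:

```python
# Hedged sketch of prompt assembly from the three layers described above.

STANDARD_GUIDELINES = (
    "Answer only from the provided context.\n"
    "Use exact table and column names from the metadata."
)

FEW_SHOT = (
    "Q: top ten days for loan production\n"
    "A: SELECT closing_date, COUNT(*) AS n FROM loan_closings "
    "GROUP BY closing_date ORDER BY n DESC LIMIT 10;"
)

DOMAIN_CONTEXT = "'Loan production' refers to closed loans, not applications."

def build_prompt(question: str, table_metadata: str) -> str:
    return (
        f"{STANDARD_GUIDELINES}\n\n"
        f"Relevant tables:\n{table_metadata}\n\n"
        f"Examples:\n{FEW_SHOT}\n\n"
        f"Domain notes: {DOMAIN_CONTEXT}\n\n"
        f"Question: {question}\nSQL:"
    )

prompt = build_prompt(
    "Give me the count of loans for the past six months.",
    "loan_closings(closing_date DATE, loan_id BIGINT, amount DECIMAL)",
)
print(prompt)
```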
This careful layering improves accuracy while guiding the model through complex queries.
Execution and post-processing
Once the LLM generates a SQL query:
- The system executes it against the appropriate database
- Results are post-processed into a user-friendly format
- Users can consume the results as-is or ask follow-up questions
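The execution step can be sketched with an in-memory SQLite database standing in for the production warehouse; the table, columns, and sample rows are hypothetical:

```python
# Illustrative execution and post-processing using in-memory SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan_closings (closing_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO loan_closings VALUES (?, ?)",
    [("2025-01-15", 250000.0), ("2025-01-15", 310000.0), ("2025-02-03", 190000.0)],
)

def run_query(sql: str) -> list[dict]:
    """Execute generated SQL and post-process rows into friendly dicts."""
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

result = run_query(
    "SELECT closing_date, COUNT(*) AS loans FROM loan_closings "
    "GROUP BY closing_date ORDER BY loans DESC"
)
print(result)
# -> [{'closing_date': '2025-01-15', 'loans': 2}, {'closing_date': '2025-02-03', 'loans': 1}]
```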
For instance, a user reviewing six months of loan data can then ask:
“Can you show me some insights?”
The system will automatically generate visualizations and highlight actionable trends, all within seconds.
The agentic framework: Orchestrating complexity
So far, we’ve described a single agent handling queries in one domain. But organizations often need cross-domain insights.
Our agentic framework introduces:
- A main orchestrator agent
- Sub-agents specialized in domains like sales, marketing, and operations, each with its own knowledge base and optimized prompts
Currently, each agent operates within domain constraints. The next step is multi-agent coordination, enabling cross-domain queries without manual integration.
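A toy routing step shows the orchestrator’s job at its simplest. The domains, keyword lists, and fallback are hypothetical; production routing is model-driven rather than keyword-driven:

```python
# Illustrative orchestrator: route a question to the best-matching sub-agent.

DOMAIN_KEYWORDS = {
    "sales": ["loan", "closing", "pipeline"],
    "marketing": ["campaign", "lead", "channel"],
    "operations": ["turnaround", "processing", "queue"],
}

def route(question: str) -> str:
    """Pick the sub-agent whose domain keywords best match the question."""
    q = question.lower()
    scores = {
        domain: sum(word in q for word in words)
        for domain, words in DOMAIN_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

print(route("How did last month's email campaign perform?"))  # -> marketing
```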
Performance optimization: Speed and cost matter
Two strategies have made a big difference:
Semantic caching
- Avoids redundant processing by reusing tables and prompts for semantically similar questions
- Requires careful tuning to handle subtleties like “past six months” vs. “past ten days”
Prompt caching
- Caches static prompt components
- Reduces computation and improves latency and cost efficiency
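Semantic caching can be sketched as a similarity-gated lookup. The toy character-frequency embedding and the 0.95 threshold are hypothetical (production uses Titan embeddings), but the threshold is exactly the knob that must distinguish near-duplicates from genuinely different questions like “past six months” vs. “past ten days”:

```python
# Illustrative semantic cache: reuse prior tables/prompts for questions
# whose embeddings are similar enough to a cached one.
import math

def embed(text: str) -> list[float]:
    # Toy character-frequency embedding; production uses Amazon Titan.
    return [float(text.lower().count(c)) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.95):
        self.threshold = threshold
        self.entries: list[tuple[list[float], object]] = []

    def get(self, question: str):
        vec = embed(question)
        for cached_vec, value in self.entries:
            if cosine(vec, cached_vec) >= self.threshold:
                return value
        return None  # cache miss: fall through to full retrieval

    def put(self, question: str, value):
        self.entries.append((embed(question), value))

cache = SemanticCache()
cache.put("loan count for the past six months", {"tables": ["loan_closings"]})
print(cache.get("loan counts for the past six months"))  # near-duplicate: hit
```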
These optimizations help Rocket Analytics scale across teams without slowing down queries.
Measuring success: How we know it’s working
We evaluate the system with several key metrics:
- Relevance: Are retrieved tables truly relevant? Checked using LLM-as-a-judge
- Hallucination detection: Compares outputs against a golden dataset
- Toxicity screening: Ensures all responses are professional
Multiple iterations per query give a reliable performance overview and help catch edge cases.

Real-world impact: Who benefits and how?
Rocket Analytics helps teams across the organization:
Sales & Marketing
- Analyze campaign effectiveness
- Identify seasonal trends
- Compare strategies across segments
Operations
- Monitor performance metrics
- Optimize processes
Finance
- Generate reports instantly
- Analyze trends
Previously, these insights took analysts days to produce; now they’re available in seconds.
The human element: Trust but verify
AI accelerates insight, but human oversight is critical:
- Human-in-the-loop checkpoints for high-stakes queries
- Regular audits to catch drift or emerging errors
Together, these checks preserve speed and accuracy without compromising trust.
Looking ahead: The evolution continues
Our roadmap includes:
- Multi-agent coordination for cross-domain insights
- Continuous performance improvements
- Expanded domain coverage for more teams
By democratizing data access, Rocket Analytics enables any team member to make data-driven decisions, without SQL expertise.
Final thoughts
Building a text-to-SQL system with agentic RAG has taught us:
- AI can augment human intelligence, not replace it
- Speed, accuracy, and accessibility matter more than sheer complexity
- Thoughtful implementation and user-centered design are key
By removing technical barriers, Rocket Analytics empowers more people to explore, analyze, and act on data, creating a competitive advantage.
The goal isn’t to build the most sophisticated AI but to build the most useful AI.
Arjun Barli, Staff Data Scientist at Rocket Mortgage, gave this presentation at our Agentic AI Summit, New York, 2025.