Grounding
Learn what Grounding means in AI and machine learning, with examples and related concepts.
Definition
Grounding is the practice of connecting an LLM’s responses to verifiable, external sources of truth — so the model answers based on real data rather than what it “memorized” during training.
When an LLM answers a question from memory alone, it might hallucinate — generating plausible-sounding but wrong information. Grounding prevents this by anchoring responses to specific documents, databases, search results, or APIs. If the model says “Claude costs $3 per million input tokens,” grounding means there’s an actual source document backing that claim.
RAG (Retrieval-Augmented Generation) is the most common grounding technique, but grounding is the broader concept. It includes any method that ties model output to verifiable information: web search, database queries, API calls, or document retrieval.
How It Works
UNGROUNDED (risky):
User: "What's the latest Claude pricing?"
Model: [relies on training data from months ago]
Model: "Claude Sonnet costs $3 per million input tokens" ← might be outdated
GROUNDED (reliable):
User: "What's the latest Claude pricing?"
System: [retrieves current pricing page]
Model: [reads the retrieved document]
Model: "According to Anthropic's pricing page, Claude Sonnet 4.6
costs $3 per million input tokens." ← sourced from real data
Grounding Methods
1. RETRIEVAL GROUNDING (RAG)
Query → Search vector database → Relevant docs → Model reads & answers
Example: Company knowledge base Q&A
2. SEARCH GROUNDING
Query → Web search API → Search results → Model synthesizes
Example: Perplexity AI, Google Gemini with Search
3. TOOL GROUNDING
Query → Model calls API/tool → Real-time data → Model responds
Example: "What's the weather?" → calls weather API → real answer
4. DOCUMENT GROUNDING
Upload document → Model reads it → Answers only from document
Example: "Summarize this contract" with uploaded PDF
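The tool-grounding flow (method 3) can be sketched in a few lines. Here `get_weather` is a hypothetical stand-in for a real weather API call; the point is that the model's context contains fresh tool output rather than memorized facts:

```python
def get_weather(city: str) -> dict:
    # Hypothetical tool: in production this would call a real weather API;
    # here it returns fixed data so the sketch is self-contained.
    return {"city": city, "temp_c": 18, "conditions": "partly cloudy"}

def build_grounded_prompt(question: str, city: str) -> str:
    """Fetch real-time data, then embed it in the prompt as the source of truth."""
    data = get_weather(city)
    return (
        f"Tool result (weather API): {data}\n\n"
        f"Answer using ONLY the tool result above.\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt("What's the weather in Paris?", "Paris")
```

The prompt that reaches the model now carries live data, so its answer is anchored to the tool result rather than to training-time memory.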
Why It Matters
- Accuracy — Grounded responses are verifiable against their sources, dramatically reducing hallucination
- Freshness — Models can access information published after their training cutoff
- Trust — Citations let users verify claims themselves
- Enterprise adoption — Most companies won’t deploy LLMs without grounding because the risk of wrong answers is too high
- Legal/compliance — In regulated industries, every AI-generated claim may need a traceable source
Example
# Grounding with web search using Perplexity-style approach
from anthropic import Anthropic
client = Anthropic()
def search_web(query: str) -> list[dict]:
    """Simulate web search — in production, use a real search API."""
    # Use SerpAPI, Brave Search, or Tavily in production
    return [
        {
            "title": "Claude Pricing - Anthropic",
            "url": "https://anthropic.com/pricing",
            "snippet": "Claude Sonnet 4.6: $3/M input, $15/M output. Claude Opus 4.6: $15/M input, $75/M output."
        },
        {
            "title": "Claude API Documentation",
            "url": "https://docs.anthropic.com",
            "snippet": "Claude supports 200K token context windows across all models."
        }
    ]
# Step 1: Search for relevant information
query = "What are the current Claude API pricing tiers?"
search_results = search_web(query)
# Step 2: Ground the model's response in search results
context = "\n\n".join(
    f"Source: {r['title']} ({r['url']})\n{r['snippet']}"
    for r in search_results
)
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    temperature=0,
    system="""Answer questions using ONLY the provided search results.
Cite sources with [Source Title](URL) for every factual claim.
If the search results don't contain the answer, say so.""",
    messages=[{
        "role": "user",
        "content": f"Search results:\n{context}\n\nQuestion: {query}"
    }]
)
print(response.content[0].text)
# → "According to [Claude Pricing - Anthropic](https://anthropic.com/pricing),
# Claude Sonnet 4.6 costs $3/M input tokens and $15/M output tokens..."
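Because the grounded answer carries `[Title](URL)` citations, a caller can check them mechanically. The helpers below are illustrative, not part of any SDK: they extract markdown-style citations and confirm each cited URL actually came from the retrieved sources.

```python
import re

def extract_citations(answer: str) -> list[tuple[str, str]]:
    """Pull [Title](URL) markdown citations out of a grounded answer."""
    return re.findall(r"\[([^\]]+)\]\((https?://[^)]+)\)", answer)

def verify_citations(answer: str, sources: list[dict]) -> bool:
    """Every cited URL must come from the retrieved search results."""
    known_urls = {s["url"] for s in sources}
    citations = extract_citations(answer)
    return bool(citations) and all(url in known_urls for _, url in citations)

sources = [{"title": "Claude Pricing - Anthropic", "url": "https://anthropic.com/pricing"}]
answer = "According to [Claude Pricing - Anthropic](https://anthropic.com/pricing), input is $3/M tokens."
```

A response with no citations, or with a URL outside the retrieved set, fails the check and can be flagged for review.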
# Document grounding — answer only from uploaded content
def grounded_qa(document: str, question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"""You are a document analysis assistant.
Answer the question based ONLY on the document below.
If the answer is not in the document, respond: "This information is not in the provided document."

--- DOCUMENT START ---
{document}
--- DOCUMENT END ---

Question: {question}"""
        }]
    )
    return response.content[0].text
# Constraining the model to the document's contents greatly reduces
# the risk of hallucination
with open("service_agreement.pdf.txt") as f:
    contract = f.read()
answer = grounded_qa(contract, "What is the termination notice period?")
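For completeness, retrieval grounding (method 1) can be sketched without a vector database. Real RAG systems use embeddings and semantic search; the keyword-overlap scorer below is a toy stand-in that shows the retrieve-then-ground shape:

```python
DOCS = [
    "Grounding ties model output to verifiable external sources.",
    "RAG retrieves relevant documents before the model answers.",
    "Tool use lets a model call APIs for real-time data.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_rag_prompt(query: str) -> str:
    # Ground the prompt in the retrieved passages only
    context = "\n".join(f"- {d}" for d in retrieve(query, DOCS))
    return f"Context:\n{context}\n\nAnswer ONLY from the context.\nQuestion: {query}"
```

Swapping the overlap scorer for an embedding model and a vector store turns this sketch into a standard RAG pipeline; the grounding step, building the prompt from retrieved text, stays the same.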
Grounding vs RAG
| | Grounding (concept) | RAG (technique) |
|---|---|---|
| Scope | Broad — any source of truth | Specific — document retrieval |
| Sources | Search, APIs, databases, documents | Vector database / document store |
| Real-time | Can be (via APIs/search) | Usually not (pre-indexed) |
| Relationship | The goal | One way to achieve it |
RAG is the most popular grounding technique, but grounding also includes web search, API calls, and tool use.
Key Takeaways
- Grounding connects LLM responses to verifiable external sources, reducing hallucination
- RAG is the most common grounding technique, but web search and API calls also qualify
- Grounded responses include citations, letting users verify claims
- Essential for enterprise AI deployment — most companies require grounding for production LLM applications
- Perplexity AI and Google Gemini with Search are prominent examples of grounded AI products
Part of the DeepRaft Glossary — AI and ML terms explained for developers.