An agent’s parametric knowledge (what the LLM was trained on) has a cutoff date and gaps. For tasks involving current events, live data, or domain knowledge not in the knowledge base, the agent needs to reach the web.
Tavily is an AI-optimised search API. Unlike general search APIs that return raw HTML, Tavily returns cleaned, summarised content designed for LLM consumption. This means:
Your TAVILY_API_KEY is pre-configured as a Codespace environment variable — no manual setup required. The notebook loads it automatically via os.environ.get("TAVILY_API_KEY").
search_tavily as a ToolThe @toolbox.register_tool(augment=True) decorator does two things:
augment=True, embeds the function’s docstring and stores it in TOOLBOX_MEMORY — so the agent can find this tool via semantic searchThe docstring matters. It is what the agent reads to decide whether to call this tool. Write it as if you are explaining to the agent when and why to use it.
Important: The function must be named search_tavily — the agent harness references this name to trigger a context refresh after web searches.
Complete solution:
from tavily import TavilyClient
from datetime import datetime
tavily_client = TavilyClient(api_key=tavily_api_key)
@toolbox.register_tool(augment=True)
def search_tavily(query: str, max_results: int = 5):
"""
Use this function to search the web and store the results in the knowledge base.
"""
response = tavily_client.search(query=query, max_results=max_results)
results = response.get("results", [])
for result in results:
text = f"Title: {result.get('title', '')}\nContent: {result.get('content', '')}\nURL: {result.get('url', '')}"
metadata = {
"title": result.get("title", ""),
"url": result.get("url", ""),
"score": result.get("score", 0),
"source_type": "tavily_search",
"query": query,
"timestamp": datetime.now().isoformat()
}
memory_manager.write_knowledge_base(text, metadata)
return results
Why it writes to the knowledge base: Unlike a simple search that returns and forgets, this tool persists every result into SEMANTIC_MEMORY. On future turns, the agent can retrieve these results via read_knowledge_base without searching again — the web content becomes part of the agent’s long-term memory.
After registering the tool, the next cell demonstrates semantic tool retrieval:
retrieved_tools = memory_manager.read_toolbox("Search the internet")
This query searches TOOLBOX_MEMORY semantically and returns tools whose descriptions are relevant to the query. The agent uses this same mechanism on each turn to decide which tools to include in its context — instead of being given all tools every time.
This is the toolbox pattern: register tools into a vector store, retrieve relevant tools per query. At scale (hundreds of tools), this prevents the tool list from consuming large portions of the context budget.
augment=TrueThe augment parameter controls whether the tool’s description is stored in memory for semantic retrieval. Set it to True for tools that should be discoverable by the agent based on task relevance. Set it to False for tools that should always be available regardless of the current task (like a get_time utility).
AuthenticationError from Tavily — Your API key is incorrect or not set. Verify the TAVILY_API_KEY environment variable is configured in your Codespace repo settings.
AssertionError: TAVILY_API_KEY not set — The environment variable is missing. Check your Codespace repo settings and rebuild if needed.
Empty results — Tavily free tier has rate limits. Wait a moment and retry.