Web Tools
Giving the LLM access to the live internet during page generation.
Overview
By default, Rabbithole generates pages using only the LLM's internal knowledge and the prompt stored for that URL. This is fast and sufficient for many use cases — especially fictional or creative sites where consistency of invented lore matters more than factual accuracy.
However, for factual, technical, or news-oriented sites, you often want the generated
page to reflect real, current information: the actual contents of a GitHub
repository, today's package version numbers, a live article, or an up-to-date API
reference. The web tools feature addresses this by giving the LLM
access to two tools during page generation: web_search and
web_fetch.
When web tools are enabled, before writing HTML the LLM can issue tool calls to search the internet or fetch a specific URL. The results are incorporated into the generated page. The Rabbithole documentation site itself (isarabbithole.com) uses this feature so that pages can reference real source code, accurate CLI flags, and up-to-date project information from github.com/ajbt200128/rabbithole.
How It Works
Rabbithole uses an LLM API that supports tool use (also called function
calling). When the --web-tools flag is passed, two tool definitions are
included in the system prompt and API call for every page generation request.
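The exact wire format depends on the LLM API in use, but function-calling-style tool definitions commonly look something like the following sketch. Field names here follow the generic JSON-schema convention; they are illustrative, not Rabbithole's actual payload.

```python
# Hypothetical shape of the two tool definitions sent with every page
# generation request when --web-tools is enabled. The schema style mirrors
# common "function calling" APIs; exact field names vary by provider.
TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web and return titles, URLs, and snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "web_fetch",
        "description": "Fetch a URL and return its plain-text body.",
        "parameters": {
            "type": "object",
            "properties": {"url": {"type": "string"}},
            "required": ["url"],
        },
    },
]
```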
The LLM can then respond with one or more tool calls before producing the final HTML. Rabbithole intercepts those calls, executes the searches or fetches on the LLM's behalf, returns the results to the model, and the model continues generating until it produces a complete HTML response.
This is a standard agentic loop: the model can make multiple sequential tool calls, reasoning about what information it still needs before writing the page. Typically a page will make one to three tool calls before producing output.
Request arrives for /some-page.html
|
v
Rabbithole looks up stored prompt
|
v
LLM API call (with tool definitions included)
|
[tool call?]
/ \
yes no
| |
v v
Execute tool Write HTML response
(search/fetch)
|
Return result
to LLM
|
[loop back]
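The loop in the diagram can be sketched in a few lines. This is a minimal Python illustration with a stubbed model and stubbed tool executors; the names (call_model, execute_tool, generate_page) are invented for this sketch and are not Rabbithole's actual internals.

```python
def execute_tool(name: str, params: dict) -> str:
    # Stand-in for the real search/fetch implementations.
    if name == "web_search":
        return f"results for: {params['query']}"
    if name == "web_fetch":
        return f"text of: {params['url']}"
    raise ValueError(f"unknown tool: {name}")

def generate_page(call_model, prompt: str, max_steps: int = 10) -> str:
    """Loop until the model returns HTML instead of a tool call."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if reply.get("tool") is None:      # no tool call: final HTML
            return reply["html"]
        result = execute_tool(reply["tool"], reply["parameters"])
        messages.append({"role": "tool", "content": result})  # loop back
    raise RuntimeError("model never produced HTML")

# Demo with a stubbed model: one search call, then final HTML.
replies = iter([
    {"tool": "web_search", "parameters": {"query": "rabbithole rust"}},
    {"tool": None, "html": "<html>done</html>"},
])
print(generate_page(lambda msgs: next(replies), "Docs homepage"))  # <html>done</html>
```

The cap on iterations (max_steps) is one way to keep a misbehaving model from looping forever; whether Rabbithole enforces such a limit is not specified here.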
The Two Tools
web_search
web_search accepts a plain-text query string and returns a list of search
results including titles, URLs, and short excerpt snippets. The LLM uses this to
discover relevant pages, check current facts, look up package versions, or find
documentation for a topic it will be writing about.
| Parameter | Type | Description |
|---|---|---|
| query | string | The search query (1–6 words works best; keep it specific) |
Example tool call:
{
"tool": "web_search",
"parameters": {
"query": "rabbithole ajbt200128 rust LLM web server"
}
}
Example result (abbreviated):
[
{
"title": "ajbt200128/rabbithole — GitHub",
"url": "https://github.com/ajbt200128/rabbithole",
"snippet": "An open-source Rust web server that dynamically generates websites using LLMs."
},
...
]
web_fetch
web_fetch accepts a URL and returns the plain-text body of that page
(HTML tags stripped). The LLM uses this to read actual source files, README documents,
configuration references, API documentation, or any page whose full content it needs
to incorporate accurately.
| Parameter | Type | Description |
|---|---|---|
| url | string | The full URL to fetch |
Example tool call:
{
"tool": "web_fetch",
"parameters": {
"url": "https://raw.githubusercontent.com/ajbt200128/rabbithole/main/README.md"
}
}
The response is the raw text content of that document — which the LLM can then quote, summarize, or use as a source of truth when writing the page.
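Tag stripping of the kind described above can be done with nothing but the standard library. This is one plausible approach, not necessarily how Rabbithole extracts text:

```python
# Sketch of "HTML tags stripped" text extraction using only the Python
# standard library. Rabbithole's actual extraction may differ.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Collect only the text between tags.
        self.parts.append(data)

    def text(self):
        return " ".join(self.parts)

def strip_tags(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return parser.text()

print(strip_tags("<h1>Rabbithole</h1><p>A Rust web server.</p>"))
# Rabbithole A Rust web server.
```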
Enabling Web Tools
Web tools are disabled by default. Pass the --web-tools
flag when starting the server to enable them:
rabbithole --web-tools --seed "Homepage for a Rust documentation site" .
Or in a configuration file:
# rabbithole.toml
web_tools = true
seed = "Homepage for a Rust documentation site"
See the Configuration page for the full list of options and their precedence rules (CLI flags override config file values).
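The precedence rule (defaults, then config file, then CLI flags, with later sources winning) can be illustrated with a small sketch. The function name and the exact default values here are illustrative assumptions, not Rabbithole's source:

```python
# Hypothetical illustration of the precedence rule: CLI flags override
# values from rabbithole.toml, which override built-in defaults.
# The default values shown are assumptions for this sketch.
DEFAULTS = {"port": 3000, "web_tools": False}

def effective_config(file_config: dict, cli_flags: dict) -> dict:
    """Later sources win: defaults < config file < CLI flags."""
    merged = dict(DEFAULTS)
    merged.update(file_config)
    merged.update(cli_flags)
    return merged

# Config file enables web tools; the CLI overrides the port.
print(effective_config({"web_tools": True}, {"port": 8080}))
# {'port': 8080, 'web_tools': True}
```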
Tradeoffs
| Concern | Details |
|---|---|
| Slower generation | Each tool call adds a round-trip: the LLM issues a call, Rabbithole executes it, and the result is sent back before generation continues. A page that makes three tool calls may take 3–5× as long to generate as one that makes none. Because Rabbithole caches pages permanently after the first visit, this penalty is only paid once per URL. |
| Copyrighted content | If the LLM fetches or searches for creative or journalistic content (news articles, blog posts, literary works), it may reproduce copyrighted material in the generated page. For factual/technical sites this risk is low. For sites that might touch creative content, consider disabling web tools or adding a system prompt warning the LLM to paraphrase rather than quote extensively. |
| Inconsistency in fictional sites | Fictional sites (invented universities, fantasy worlds, alternate-history settings) rely on the LLM maintaining consistent invented facts across pages. Giving the model live internet access can cause it to "correct" invented facts with real-world information, breaking immersion. For example, a fictional college might have its invented founding year overridden by real Wikipedia data the LLM finds about a similarly named institution. |
| External dependency | Web tools introduce a runtime dependency on external search and fetch services. If those services are slow or unavailable, page generation will time out or degrade. Cache warm-up passes (see Deployment) are especially important when web tools are enabled. |
| Non-determinism | Search results change over time. Two visits to the same uncached URL on different days could produce different pages if search results have changed. Since Rabbithole caches permanently, this is only a concern for the first generation — but it means your cache reflects the state of the web at the time of first visit. |
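Several rows above hinge on the same property: a page is generated once, on first visit, and served from cache forever after. That behaviour can be sketched as follows (names illustrative, not Rabbithole's internals):

```python
# Sketch of "generate once, cache forever": the slow, tool-using
# generation happens only on the first visit to each URL path.
cache: dict[str, str] = {}
generations: list[str] = []   # tracks how many times we actually generate

def generate(path: str) -> str:
    generations.append(path)  # expensive step (may involve web tools)
    return f"<html>{path}</html>"

def serve(path: str) -> str:
    if path not in cache:
        cache[path] = generate(path)  # paid once per URL
    return cache[path]                # every later visit is a cache hit

serve("/docs.html")
serve("/docs.html")
print(generations)  # ['/docs.html']  — generated only once
```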
When to Use (and When Not To)
Enable web tools for:
- Technical documentation sites that should reference real source code (e.g., this site itself)
- News or current-events sites where information freshness is the whole point
- Reference pages that should quote real package versions, API signatures, or release notes
- Sites that link out to external resources and want to verify those resources exist
- Any site where you want the LLM to ground its claims in real, checkable sources
Avoid web tools for:
- Purely fictional sites — e.g., invented academic institutions like ACAPA (the Astrological College of Applied Practical Arts) or CGPA (the College of Goat Pastoralism and Administration). The LLM should invent, not research.
- Sites where generation speed is critical and you cannot afford the extra latency
- Sites whose content involves creative works where copyright risk is elevated
- Isolated or air-gapped deployments without external network access
The rule of thumb: if the page would be more accurate by looking things up, enable web tools. If the page would be less consistent by looking things up, disable them.
Before / After Examples
The following illustrates how the same prompt produces qualitatively different pages depending on whether web tools are enabled.
Example 1: Documentation page for a real library
Prompt: "Page documenting the --port flag of the Rabbithole CLI."
Without web tools
The --port flag sets the TCP port the server listens on. Default is likely 8080 or 3000 (exact value may vary by version).
With web tools (after fetching README)
The --port flag sets the TCP port.
Default: 3000. Example:
rabbithole --port 8080 \
--seed "My site" .
This is equivalent to setting
port = 8080 in rabbithole.toml.
Example 2: A page in a fictional university site
Prompt: "Department of Recursive Hermeneutics at ACAPA, course listings."
Without web tools
RH 101: Introduction to
Self-Referential Texts
RH 204: The Paradox Seminar
RH 310: Advanced Circular
Reasoning Lab
All courses held in the
Möbius Building, Room ∞.
With web tools (searched for "hermeneutics")
RH 101: Introduction to Hermeneutic
Circles (Dilthey)
RH 204: Gadamer and Tradition
RH 310: Ricoeur's Narrative Theory
[Consistent invented lore broken; real academic references intrude.]
The second example shows why fictional sites should disable web tools: the LLM "corrects" the invented curriculum with real hermeneutics scholarship, undermining the playful fictional premise.
Example 3: A current-events page
Prompt: "Latest Rust stable release notes."
Without web tools
Rust 1.75 introduced async fn in traits. (Note: this may be outdated; check blog.rust-lang.org for the current release.)
With web tools (after searching + fetching)
Rust 1.87.0 (released 2025-05-15)
- Stabilized ...
- Performance improvements to ...
- See full notes at: blog.rust-lang.org/2025/05/15/...
See Also
- Configuration Reference — full list of flags including --web-tools
- Architecture — how the page generation loop works in detail
- Getting Started — quick start guide
- Examples — example sites with and without web tools
- Deployment — cache warm-up and production considerations
- GitHub: ajbt200128/rabbithole — source code
- Citation & Copyright Policy — guidelines for LLMs using web tools responsibly
- Advanced Web Tools Usage — custom tool configurations, rate limiting, and caching fetched content