Rabbithole — open-source Rust tool for on-the-fly LLM website generation.
Source: github.com/ajbt200128/rabbithole | Live demo: isarabbithole.com | This page: /docs/web-tools.html
Web tools give the LLM the ability to perform real-time internet research during page generation. They are enabled by default. When active, the model may issue tool calls that the Rabbithole server intercepts, executes server-side, and feeds back as additional context before the model produces its final HTML output.
The model never directly accesses the internet. All HTTP requests are made by the Rust server
using the reqwest crate. The model
only receives structured results returned by the server.
Two tools are available:
| Tool name | Purpose | Input | Output |
|---|---|---|---|
| web_search | Search the web; get titles, URLs, and snippets | query (string) | Structured list: title, URL, snippet per result |
| web_fetch | Fetch and read a URL as plain text | url (string) | Plain text body, truncated to 50,000 characters |
web_search issues a web search query and returns structured results containing the
page title, URL, and a snippet of relevant content for each result. The model can use these
results to:
<img src="...">)
The search is performed server-side. The model receives a JSON-like array of result objects.
It does not see raw HTML or the full content of any result page — only the snippet.
To read a full page, the model must follow up with web_fetch.
web_fetch takes a URL, fetches it via reqwest, strips all HTML tags to
produce plain text, then truncates the result at 50,000 characters to fit within
the model's context window. The plain text is returned to the model as a tool result.
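The strip-then-truncate step can be sketched with the standard library alone. This is a minimal illustration, not Rabbithole's actual code: the function names are hypothetical, and the tag stripper is deliberately naive (it drops anything between `<` and `>`, does not decode entities, and does not skip the bodies of script or style elements).

```rust
/// Character limit applied to fetched text, as stated in this document.
const MAX_CHARS: usize = 50_000;

/// Naive tag stripper (hypothetical helper): drops everything between
/// '<' and '>'. A real implementation would likely use an HTML parser.
fn strip_tags(html: &str) -> String {
    let mut out = String::with_capacity(html.len());
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out
}

/// Keeps at most `max_chars` characters, cutting on a char boundary so
/// multi-byte UTF-8 sequences are never split.
fn truncate_chars(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text,
    }
}

/// Hypothetical post-processing applied to a fetched HTML body.
fn fetch_result_text(html: &str) -> String {
    truncate_chars(&strip_tags(html), MAX_CHARS).to_string()
}
```

Counting characters rather than bytes matches the "50,000 characters" wording; whether the server counts characters or bytes is not stated here.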
A typical use is reading the full content of a page found through web_search. In some cases, a web_search with a more targeted query might be more effective than fetching a page.
Web tool use is implemented as a multi-turn conversation loop inside the Rabbithole server. The server runs up to 10 rounds per page generation. In each round, the server sends the conversation so far to the model; if the model responds with tool calls, the server executes them (HTTP requests go through reqwest) and appends the results to the conversation for the next round.
The loop exits when either: (a) the model produces a final text response with no further tool calls, or (b) the 10-round maximum is reached, at which point the server uses whatever text the model last produced.
In practice most pages use between 0 and 5 tool call rounds. The 10-round cap prevents runaway loops in pathological cases.
Round 1: model → [tool_call: web_search("rust reqwest HTTP client")]
server executes → returns 10 search results
server appends result as user message
Round 2: model → [tool_call: web_fetch("https://docs.rs/reqwest/latest/reqwest/")]
server executes → returns plain text, truncated at 50k chars
server appends result as user message
Round 3: model → <!DOCTYPE html>...</html> ← final output, loop exits
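The round-capped loop can be sketched as follows. This is an illustration under stated assumptions, not the server's real code: `ModelReply`, `ToolCall`, and `generate_page` are hypothetical names, the model and the tool runner are passed in as closures so the sketch stays self-contained, and the conversation is reduced to a list of strings.

```rust
/// Round cap stated in this document.
const MAX_ROUNDS: usize = 10;

/// One requested tool invocation (simplified to a single string argument).
struct ToolCall {
    name: String,
    argument: String,
}

/// What the model returned for one round (hypothetical shape).
struct ModelReply {
    text: Option<String>,
    tool_calls: Vec<ToolCall>,
}

/// Runs the tool loop. `model` sees the conversation so far and replies;
/// `run_tool` executes one tool call and returns its result text.
fn generate_page(
    prompt: &str,
    mut model: impl FnMut(&[String]) -> ModelReply,
    mut run_tool: impl FnMut(&ToolCall) -> String,
) -> String {
    let mut conversation = vec![prompt.to_string()];
    let mut last_text = String::new();
    for _round in 0..MAX_ROUNDS {
        let reply = model(&conversation);
        if let Some(text) = reply.text {
            last_text = text;
        }
        // Exit (a): a response with no further tool calls is final.
        if reply.tool_calls.is_empty() {
            return last_text;
        }
        for call in &reply.tool_calls {
            let result = run_tool(call);
            conversation.push(format!("{}({}) -> {}", call.name, call.argument, result));
        }
    }
    // Exit (b): cap reached; use whatever text the model last produced.
    last_text
}
```

Passing the model as a closure makes the cap easy to exercise with a stub that never stops requesting tools.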
Pass the --no-web-tools flag to disable both web_search and
web_fetch entirely:
rabbithole --no-web-tools
When this flag is set, the server does not register either tool with the model. The model generates all pages purely from its training knowledge and the prompt. No HTTP requests are made during generation.
This is a global flag — it cannot be toggled per-page or per-request. To enable tools for some pages and not others, you would need to run two separate server instances.
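Because the flag is process-wide, the decision about which tools to register can be made once at startup. The sketch below assumes a hypothetical `tools_to_register` helper that scans the raw argument list; how Rabbithole actually parses its command line is not described here.

```rust
/// Names of the two tools described in this document.
const WEB_TOOLS: [&str; 2] = ["web_search", "web_fetch"];

/// Returns the tools to register with the model for the whole process.
/// `args` are the command-line arguments without the program name.
/// (Hypothetical helper; the real server's argument handling may differ.)
fn tools_to_register(args: &[String]) -> Vec<&'static str> {
    let disabled = args.iter().any(|a| a.as_str() == "--no-web-tools");
    if disabled {
        Vec::new()
    } else {
        WEB_TOOLS.to_vec()
    }
}
```

At startup this could be called as `tools_to_register(&std::env::args().skip(1).collect::<Vec<_>>())`. An empty result means the model is never told the tools exist, which fits the description of the flag above.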
| Scenario | Recommendation | Reason |
|---|---|---|
| Documentation site (e.g., this site) | Leave enabled | Real API docs, version numbers, links, and facts improve accuracy |
| News or current-events site | Leave enabled | Model knowledge cutoff makes recent facts unreliable without search |
| Fictional / creative site (e.g., ACAPA demo, CGPA demo) | Consider disabling | No real-world facts needed; disabling reduces latency and cost |
| Static portfolio or marketing site | Either | Depends on whether you want to pull in real external content |
| High-throughput / low-latency deployment | Disable | Each tool call round adds 2–5 seconds; disabling can halve total latency |
| Restricted network environment | Disable | Avoids outbound HTTP calls from the server process entirely |
Each tool call round has two components of overhead:
Latency: each tool call round adds roughly 2–5 seconds to page generation.
Token cost: a web_fetch result (up to 50,000 characters, roughly 12,000–15,000 tokens) will significantly increase the token count for that generation and thus the API cost. web_search results are much smaller (typically a few hundred tokens per search call).
If a single generation includes several rounds of web_fetch, the token count for that request could become very large. Monitor costs when running in production with large numbers of page requests.
Rough estimates (varies by model and provider):
| Configuration | Approx. time per page | Approx. token count |
|---|---|---|
| No tool calls (or --no-web-tools) | 5–8 sec | 2,000–5,000 |
| 1–2 web_search calls | 8–12 sec | 3,000–7,000 |
| 1 web_fetch + 1–2 web_search | 12–18 sec | 10,000–20,000 |
| 3+ rounds mixed | 15–25 sec | 15,000–40,000+ |
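The token figures for web_fetch can be turned into a quick budgeting helper. The sketch below scales the ratio quoted in this document (50,000 characters to roughly 12,000–15,000 tokens) linearly; the function names are hypothetical, and real token counts depend on the model's tokenizer and on the content.

```rust
/// Maximum web_fetch result size stated in this document.
const MAX_FETCH_CHARS: usize = 50_000;

/// Estimates a (low, high) token range for `chars` characters by scaling
/// the document's figure of 12,000–15,000 tokens per 50,000 characters.
/// This is a rough heuristic, not a tokenizer.
fn estimate_tokens(chars: usize) -> (usize, usize) {
    (chars * 12 / 50, chars * 15 / 50)
}

/// Upper estimate of tokens added by `fetches` full-size web_fetch results.
fn worst_case_fetch_tokens(fetches: usize) -> usize {
    fetches * estimate_tokens(MAX_FETCH_CHARS).1
}
```

Under this heuristic, three full-size fetches in one generation would add on the order of 45,000 tokens, in line with the open-ended upper bound in the last table row.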
Below is a simplified representation of the message exchange that occurs inside the Rabbithole server during a single page generation that uses web tools. This is what the server constructs and passes to the LLM API internally — it is not user-facing.
// ── Initial request ────────────────────────────────────────────────────────
{
  "role": "user",
  "content": "Generate an HTML page about the reqwest Rust crate. ..."
}
// ── Round 1: model requests a tool call ────────────────────────────────────
{
  "role": "assistant",
  "content": null,
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "web_search",
      "arguments": "{\"query\": \"reqwest Rust HTTP client crate docs\"}"
    }
  }]
}
// ── Server executes web_search, returns results ───────────────────────────
{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "[{\"title\": \"seanmonstar/reqwest\", \"url\": \"https://github.com/seanmonstar/reqwest\",
               \"snippet\": \"An ergonomic, batteries-included HTTP Client for Rust.\"},
              {\"title\": \"reqwest - docs.rs\", \"url\": \"https://docs.rs/reqwest\",
               \"snippet\": \"reqwest 0.12.x: async and blocking HTTP client for Rust\"},
              ...]"
}
// ── Round 2: model requests another tool call ─────────────────────────────
{
  "role": "assistant",
  "content": null,
  "tool_calls": [{
    "id": "call_def456",
    "type": "function",
    "function": {
      "name": "web_fetch",
      "arguments": "{\"url\": \"https://docs.rs/reqwest/latest/reqwest/\"}"
    }
  }]
}
// ── Server fetches URL, strips HTML, truncates at 50,000 chars ───────────
{
  "role": "tool",
  "tool_call_id": "call_def456",
  "content": "reqwest\nAn ergonomic, batteries-included HTTP Client for Rust.\n\nFeatures\n- async/await\n- TLS\n- ..."
  // (truncated at 50,000 chars)
}
// ── Round 3: model produces final output ──────────────────────────────────
{
  "role": "assistant",
  "content": "<!DOCTYPE html>\n<html>\n..."
}
// Loop exits. Server returns HTML to the cache and serves it.
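In the exchange above, each tool message carries a tool_call_id that matches the id of an earlier assistant tool call. The sketch below models that pairing with a simplified message type and a consistency check. The `Message` enum and `tool_results_are_paired` are hypothetical names for illustration; nothing here implies the server performs this validation.

```rust
/// Simplified mirror of the message shapes in the exchange above.
enum Message {
    User { content: String },
    Assistant { content: Option<String>, tool_call_ids: Vec<String> },
    Tool { tool_call_id: String, content: String },
}

/// Returns true if every tool message answers a tool call issued by an
/// earlier assistant message, and no tool call is left unanswered.
fn tool_results_are_paired(conversation: &[Message]) -> bool {
    let mut pending: Vec<&str> = Vec::new();
    for message in conversation {
        match message {
            Message::Assistant { tool_call_ids, .. } => {
                pending.extend(tool_call_ids.iter().map(String::as_str));
            }
            Message::Tool { tool_call_id, .. } => {
                match pending.iter().position(|id| *id == tool_call_id.as_str()) {
                    Some(i) => {
                        pending.remove(i);
                    }
                    // A result with no matching earlier call.
                    None => return false,
                }
            }
            Message::User { .. } => {}
        }
    }
    pending.is_empty()
}
```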
A few details about how the tools are implemented server-side in Rust:
All HTTP requests go through the reqwest crate, one of the most widely used async HTTP clients for Rust. It is built on top of hyper and supports TLS, connection pooling, and async/await via Tokio.
web_fetch strips HTML before returning content, which avoids feeding markup (<script>, <style>, attributes, etc.) into the model's context, where it would waste tokens and reduce signal quality.
The plain text returned by web_fetch is truncated at 50,000 characters. This is a hard limit applied before the content is appended to the conversation.
Each call to web_fetch or web_search makes fresh HTTP requests.
The server inspects the tool_calls response field and dispatches to the appropriate Rust function.
When --no-web-tools is set, neither tool is registered with the model.
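The dispatch step mentioned above, routing a tool call by name to a Rust function, might look roughly like this. The function bodies are stand-ins that only format a string (the real tools would issue HTTP requests through reqwest), and the `dispatch` signature and error type are assumptions made for this sketch.

```rust
/// Stand-in for the real search tool (hypothetical body).
fn web_search(query: &str) -> String {
    format!("search results for: {}", query)
}

/// Stand-in for the real fetch tool (hypothetical body).
fn web_fetch(url: &str) -> String {
    format!("plain text of: {}", url)
}

/// Routes one tool call to the matching function by tool name.
/// Unknown names become an error string instead of a panic, so the
/// caller can report the failure back to the model if it chooses to.
fn dispatch(name: &str, argument: &str) -> Result<String, String> {
    match name {
        "web_search" => Ok(web_search(argument)),
        "web_fetch" => Ok(web_fetch(argument)),
        other => Err(format!("unknown tool: {}", other)),
    }
}
```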