Web Tools

Rabbithole — open-source Rust tool for on-the-fly LLM website generation.
Source: github.com/ajbt200128/rabbithole  |  Live demo: isarabbithole.com  |  This page: /docs/web-tools.html

1. Overview

Web tools give the LLM the ability to perform real-time internet research during page generation. They are enabled by default. When active, the model may issue tool calls that the Rabbithole server intercepts, executes server-side, and feeds back as additional context before the model produces its final HTML output.

The model never directly accesses the internet. All HTTP requests are made by the Rust server using the reqwest crate. The model only receives structured results returned by the server.

Two tools are available:

Tool name | Purpose | Input | Output
web_search | Search the web; get titles, URLs, and snippets | query (string) | Structured list: title, URL, snippet per result
web_fetch | Fetch and read a URL as plain text | url (string) | Plain text body, truncated to 50,000 characters
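
As an illustration of the table above, the two tools and their single string input could be modelled as a small enum. This is a minimal sketch, not Rabbithole's actual code: the `ToolCall` type and the `parse_tool_call` function are hypothetical names, and only the tool names and their inputs come from this page.

```rust
/// The two tools from the table above, each with its single string input.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum ToolCall {
    WebSearch { query: String },
    WebFetch { url: String },
}

/// Map a tool name and its one string argument to a typed call.
/// Returns `None` for any name other than the two documented tools.
fn parse_tool_call(name: &str, arg: &str) -> Option<ToolCall> {
    match name {
        "web_search" => Some(ToolCall::WebSearch { query: arg.to_string() }),
        "web_fetch" => Some(ToolCall::WebFetch { url: arg.to_string() }),
        _ => None,
    }
}
```

Rejecting unknown names with `None` keeps the dispatch closed: a model that asks for a tool outside the table gets no server-side action.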

2. The Two Tools

2a. web_search

web_search issues a web search query and returns structured results containing the page title, URL, and a snippet of relevant content for each result. The model can use these results to:

The search is performed server-side. The model receives a JSON-like array of result objects. It does not see raw HTML or the full content of any result page — only the snippet. To read a full page, the model must follow up with web_fetch.
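
To make the shape of a search result concrete, here is a rough sketch of how a list of results might be rendered into the JSON-like array the model receives. The `SearchResult` struct, `json_escape`, and `render_results` are hypothetical names; the page only states that each result carries a title, a URL, and a snippet. A real server would more likely use a JSON library than hand-written escaping.

```rust
/// One search hit as described above: title, URL, and snippet only.
/// (Hypothetical type, for illustration only.)
struct SearchResult {
    title: String,
    url: String,
    snippet: String,
}

/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render the hits as a JSON array of {title, url, snippet} objects.
fn render_results(results: &[SearchResult]) -> String {
    let items: Vec<String> = results
        .iter()
        .map(|r| {
            format!(
                "{{\"title\": \"{}\", \"url\": \"{}\", \"snippet\": \"{}\"}}",
                json_escape(&r.title),
                json_escape(&r.url),
                json_escape(&r.snippet)
            )
        })
        .collect();
    format!("[{}]", items.join(", "))
}
```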

2b. web_fetch

web_fetch takes a URL, fetches it via reqwest, strips all HTML tags to produce plain text, then truncates the result at 50,000 characters to fit within the model's context window. The plain text is returned to the model as a tool result.

Typical uses:

Note: The 50,000 character truncation is intentional. Very large pages (e.g., long Wikipedia articles or large source files) are cut off at that limit. The model only receives the first 50k characters. If a needed section appears late in a document, consider whether web_search with a more targeted query might be more effective.
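
The strip-then-truncate step described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the server's actual code: `strip_tags`, `truncate_chars`, and `to_tool_result` are hypothetical helpers, the tag stripper is deliberately naive, and the limit is assumed to count Unicode characters rather than bytes. The fetch itself (done via reqwest in Rabbithole) is left out so the example needs only the standard library.

```rust
/// Character limit stated in this section.
const MAX_CHARS: usize = 50_000;

/// Naive tag stripper: drops everything between '<' and '>'.
/// Assumption: a production version would also need to handle
/// <script>/<style> bodies, comments, and HTML entities.
fn strip_tags(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => text.push(c),
            _ => {}
        }
    }
    text
}

/// Keep at most `limit` characters (not bytes), so a multi-byte
/// character is never split in the middle.
fn truncate_chars(text: &str, limit: usize) -> &str {
    match text.char_indices().nth(limit) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text,
    }
}

/// Hypothetical helper combining both steps on an already-fetched body.
fn to_tool_result(html: &str) -> String {
    truncate_chars(&strip_tags(html), MAX_CHARS).to_string()
}
```

Truncating on a character boundary matters in Rust because slicing a `str` at an arbitrary byte offset panics if it lands inside a multi-byte character.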

3. The Multi-Turn Tool Loop

Web tool use is implemented as a multi-turn conversation loop inside the Rabbithole server. The server runs up to 10 rounds per page generation. Each round follows this sequence:

  1. The model produces a response. If the response contains a tool call, proceed to step 2. If the response is plain text (the final HTML), exit the loop.
  2. The server parses the tool call (tool name + arguments).
  3. The server executes the tool call (performs the HTTP request via reqwest).
  4. The result is appended to the conversation as a user message containing the tool output.
  5. The model is called again with the updated conversation. Return to step 1.

The loop exits when either: (a) the model produces a final text response with no further tool calls, or (b) the 10-round maximum is reached, at which point the server uses whatever text the model last produced.

In practice most pages use between 0 and 5 tool call rounds. The 10-round cap prevents runaway loops in pathological cases.

Round 1:  model → [tool_call: web_search("rust reqwest HTTP client")]
          server executes → returns 10 search results
          server appends result as user message

Round 2:  model → [tool_call: web_fetch("https://docs.rs/reqwest/latest/reqwest/")]
          server executes → returns plain text, truncated at 50k chars
          server appends result as user message

Round 3:  model → <!DOCTYPE html>...</html>    ← final output, loop exits
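
The loop above can be sketched in Rust roughly as follows. This is an illustrative outline, not the real implementation: `ToolRequest`, `ModelResponse`, and `run_tool_loop` are hypothetical names, the conversation is reduced to a list of strings, and the model and tool executor are passed in as closures so the sketch runs without network access. It assumes one round equals one model call and that a tool call in the final round is still executed before the loop gives up.

```rust
/// Maximum number of model calls per page, per the text above.
const MAX_ROUNDS: usize = 10;

/// A tool request emitted by the model (hypothetical type).
struct ToolRequest {
    name: String,
    argument: String,
}

/// One model turn: optional text plus an optional tool request.
struct ModelResponse {
    text: Option<String>,
    tool_call: Option<ToolRequest>,
}

/// Run the multi-turn loop. `model` sees the conversation so far;
/// `execute` performs a tool call and returns its output as text.
/// Returns the last text the model produced, if any.
fn run_tool_loop<M, E>(prompt: &str, mut model: M, mut execute: E) -> Option<String>
where
    M: FnMut(&[String]) -> ModelResponse,
    E: FnMut(&ToolRequest) -> String,
{
    let mut conversation = vec![prompt.to_string()];
    let mut last_text = None;
    for _ in 0..MAX_ROUNDS {
        let response = model(&conversation);
        if response.text.is_some() {
            last_text = response.text;
        }
        match response.tool_call {
            // No tool call: this is the final output, exit the loop.
            None => return last_text,
            // Tool call: execute it and append the result for the next round.
            Some(call) => {
                let output = execute(&call);
                conversation.push(output);
            }
        }
    }
    // Round cap reached: fall back to whatever text was last produced.
    last_text
}
```

Passing the model and executor as closures is only a convenience for this sketch; it lets the loop logic be exercised with scripted responses instead of a live LLM API.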

4. Disabling Web Tools

Pass the --no-web-tools flag to disable both web_search and web_fetch entirely:

rabbithole --no-web-tools

When this flag is set, the server does not register either tool with the model. The model generates all pages purely from its training knowledge and the prompt. No HTTP requests are made during generation.

This is a global flag — it cannot be toggled per-page or per-request. To enable tools for some pages and not others, you would need to run two separate server instances.
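
As a rough picture of what a global flag means here, the sketch below decides once, from the process arguments, which tools get registered. The function names `web_tools_enabled` and `registered_tools` are hypothetical, and the real server may well use a CLI parsing crate instead of scanning arguments by hand; only the `--no-web-tools` flag and the two tool names come from this page.

```rust
/// Returns false when `--no-web-tools` appears among the arguments.
/// (Hypothetical helper; hand-rolled parsing for illustration only.)
fn web_tools_enabled(args: &[String]) -> bool {
    !args.iter().any(|arg| arg == "--no-web-tools")
}

/// Tool names to register with the model for this server process.
/// With tools disabled the list is empty, so the model is never
/// offered web_search or web_fetch.
fn registered_tools(enabled: bool) -> Vec<&'static str> {
    if enabled {
        vec!["web_search", "web_fetch"]
    } else {
        Vec::new()
    }
}
```

Because the decision is made from the process arguments, it applies to every request that process serves, which is why per-page toggling needs a second instance.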

5. When to Use vs. Disable

Scenario | Recommendation | Reason
Documentation site (e.g., this site) | Leave enabled | Real API docs, version numbers, links, and facts improve accuracy
News or current-events site | Leave enabled | Model knowledge cutoff makes recent facts unreliable without search
Fictional / creative site (e.g., ACAPA demo, CGPA demo) | Consider disabling | No real-world facts needed; disabling reduces latency and cost
Static portfolio or marketing site | Either | Depends on whether you want to pull in real external content
High-throughput / low-latency deployment | Disable | Each tool call round adds 2–5 seconds; disabling can halve total latency
Restricted network environment | Disable | Avoids outbound HTTP calls from the server process entirely

6. Performance & Cost

Each tool call round has two components of overhead:

Latency: Each round requires a full round-trip to the LLM API plus the time to execute the tool (the HTTP request). In practice, each round adds roughly 2–5 seconds. A page that makes 3 rounds of tool calls might take 15–20 seconds total to generate, versus 5–8 seconds with no tool calls.

Token cost: Tool results are fed back into the conversation as additional tokens. A large web_fetch result (up to 50,000 characters ≈ 12,000–15,000 tokens) will significantly increase the token count for that generation and thus the API cost. web_search results are much smaller (typically a few hundred tokens per search call).

Cost warning: If a page generation hits the 10-round limit and each round includes a large web_fetch, the token count for that request could become very large. Monitor costs when running in production with large numbers of page requests.

Rough estimates (varies by model and provider):

Configuration | Approx. time per page | Approx. token count
No tool calls (or --no-web-tools) | 5–8 sec | 2,000–5,000
1–2 web_search calls | 8–12 sec | 3,000–7,000
1 web_fetch + 1–2 web_search | 12–18 sec | 10,000–20,000
3+ rounds mixed | 15–25 sec | 15,000–40,000+
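
The arithmetic behind these figures can be written out as a small back-of-envelope calculator. This is only a sketch built from the numbers quoted in this section (a 5 to 8 second baseline, 2 to 5 seconds per tool round, and 50,000 characters ≈ 12,000 to 15,000 tokens); the `Estimate` type and both function names are hypothetical, and real values vary by model and provider.

```rust
/// Inclusive low/high estimate. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
struct Estimate {
    low: u32,
    high: u32,
}

/// Estimated seconds per page: a 5 to 8 s baseline with no tool calls,
/// plus 2 to 5 s for each tool call round.
fn estimate_latency_secs(tool_rounds: u32) -> Estimate {
    Estimate {
        low: 5 + 2 * tool_rounds,
        high: 8 + 5 * tool_rounds,
    }
}

/// Estimated tokens added by one web_fetch result, scaled from the
/// ratio quoted above (50,000 chars is about 12,000 to 15,000 tokens).
fn estimate_fetch_tokens(chars: u32) -> Estimate {
    // web_fetch never returns more than 50,000 characters.
    let chars = chars.min(50_000);
    Estimate {
        low: chars * 12 / 50,
        high: chars * 15 / 50,
    }
}
```

For three tool rounds this gives 11 to 23 seconds, which brackets the 15–20 second figure quoted above. Ten rounds that each return a full-size fetch would add roughly 120,000 to 150,000 tokens of tool output to a single generation.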

7. Example Tool Call Exchange

Below is a simplified representation of the message exchange that occurs inside the Rabbithole server during a single page generation that uses web tools. This is what the server constructs and passes to the LLM API internally — it is not user-facing.

// ── Initial request ────────────────────────────────────────────────────────
{
  "role": "user",
  "content": "Generate an HTML page about the reqwest Rust crate. ..."
}

// ── Round 1: model requests a tool call ────────────────────────────────────
{
  "role": "assistant",
  "content": null,
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "web_search",
      "arguments": "{\"query\": \"reqwest Rust HTTP client crate docs\"}"
    }
  }]
}

// ── Server executes web_search, returns results ───────────────────────────
{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "[{\"title\": \"seanmonstar/reqwest\", \"url\": \"https://github.com/seanmonstar/reqwest\",
               \"snippet\": \"An ergonomic, batteries-included HTTP Client for Rust.\"},
              {\"title\": \"reqwest - docs.rs\", \"url\": \"https://docs.rs/reqwest\",
               \"snippet\": \"reqwest 0.12.x — async and blocking HTTP client for Rust\"},
              ...]"
}

// ── Round 2: model requests another tool call ─────────────────────────────
{
  "role": "assistant",
  "content": null,
  "tool_calls": [{
    "id": "call_def456",
    "type": "function",
    "function": {
      "name": "web_fetch",
      "arguments": "{\"url\": \"https://docs.rs/reqwest/latest/reqwest/\"}"
    }
  }]
}

// ── Server fetches URL, strips HTML, truncates at 50,000 chars ───────────
{
  "role": "tool",
  "tool_call_id": "call_def456",
  "content": "reqwest\nAn ergonomic, batteries-included HTTP Client for Rust.\n\nFeatures\n- async/await\n- TLS\n- ..."
  // (truncated at 50,000 chars)
}

// ── Round 3: model produces final output ──────────────────────────────────
{
  "role": "assistant",
  "content": "<!DOCTYPE html>\n<html>\n..."
}
// Loop exits. Server returns HTML to the cache and serves it.

8. Implementation Notes

A few details about how the tools are implemented server-side in Rust:
