Honest warning: Rabbithole is not a cost-effective platform for high-traffic sites. Every uncached page view triggers a live LLM API call. At scale, this gets expensive fast. Read this page before deploying to production.
Contents

How Billing Works

Rabbithole generates pages by calling the Anthropic Claude API. Each page generation is a single API call. You are billed by Anthropic directly based on token consumption — both input (the system prompt + page concept) and output (the generated HTML).

You pay per uncached page visit. The first visitor to any URL triggers an API call and pays the cost. Subsequent visitors to the same URL are served from the SQLite cache for free. There is no Rabbithole-specific billing layer; it is a pass-through to Anthropic's token pricing.

A typical page generation involves:

See the Architecture page for a detailed breakdown of what goes into each API call.

Model Tiers & Per-Page Cost

You configure which Claude model Rabbithole uses in config.toml. The choice of model is the single biggest lever on cost. Anthropic offers three recommended tiers in 2026: Haiku 4.5 ($1/$5), Sonnet 4.6 ($3/$15), and Opus 4.6 ($5/$25) per million input/output tokens.

Model Input (per 1M tokens) Output (per 1M tokens) Est. cost / page (no tools) Est. cost / page (with tools) Quality
claude-haiku-4-5 $1.00 $5.00 ~$0.02–$0.05 ~$0.06–$0.15 Fast; simpler pages; occasional errors in complex layouts
claude-sonnet-4-6 $3.00 $15.00 ~$0.05–$0.12 ~$0.15–$0.40 Good balance; recommended default for most sites
claude-opus-4-6 $5.00 $25.00 ~$0.08–$0.20 ~$0.25–$0.60 Best quality; richest output; highest cost

These are estimates. Actual cost depends on the length of the system prompt, the complexity of the page concept, and how much HTML the model generates. A sparse, minimalist page costs less than a rich data-heavy one.

Token math example (Sonnet 4.6, no tools):
Input: 2,000 tokens × $3.00 / 1,000,000 = $0.006
Output: 5,000 tokens × $15.00 / 1,000,000 = $0.075
Total: ~$0.08 per page generation

Anthropic API pricing ranges from $0.25/$1.25 per million tokens (Haiku) to $15/$75 per million tokens (older Opus generations). The newer model families are generally more efficient. Don't default to Opus for routine workloads — costs can be up to 100x higher than Haiku.

Caching Behavior

Rabbithole uses SQLite for page caching. Once a URL is generated, the resulting HTML is stored permanently. Subsequent requests for the same URL are served directly from the database with zero API calls and zero cost.

Visit Type API Call? Cost Latency
First visit (cache miss) Yes $0.02–$0.60 depending on model & tools 3–30 seconds (LLM generation time)
Repeat visit (cache hit) No $0.00 <10ms (SQLite read)
After cache clear / invalidation Yes $0.02–$0.60 3–30 seconds

Implications:

Note: Rabbithole's SQLite cache is separate from Anthropic's own prompt caching feature (which reduces cost for reused system prompts within API calls). Prompt caching can be configured to additionally reduce costs on the Anthropic side. Prompt caching reduces costs and latency by reusing previously processed portions of your prompt across API calls, reading from cache at a fraction of the standard input price instead of reprocessing the same large system prompt on every request.

Web Tools Cost Multiplier

When web tools (web search and web fetch) are enabled, each page generation may involve multiple additional round-trips with the API. Each tool invocation causes the model to:

  1. Output a tool call request (tokens charged)
  2. Receive the tool result (injected as input tokens)
  3. Continue generating (more output tokens)

A typical page with 2 web searches + 1 web fetch will add approximately 3,000–8,000 additional input tokens (search results can be verbose) and extend generation time significantly.

Tool Usage Additional Input Tokens (est.) Cost Impact (Sonnet 4.6)
No tools 0 baseline
1 web search +2,000–4,000 +$0.006–$0.012
2 web searches + 1 fetch +6,000–15,000 +$0.018–$0.045
3+ searches + multiple fetches +15,000–40,000 +$0.045–$0.12

The combined cost of an Opus-class model with aggressive tool use can reach $0.40–$0.60 per page. This page you are reading likely cost approximately that amount to generate.

Tip: Disable web tools (web_tools = false in config) for sites that don't need real-time information. Static content, fiction, documentation, and other non-news sites rarely benefit enough to justify the added cost.

Cost by Traffic Scenario

The following table models realistic deployment scenarios. These use Sonnet 4.6 without tools as the baseline. Adjust proportionally for other models.

Scenario Unique Pages Visitors/day Cache hit rate API calls/day Est. daily cost Est. monthly cost
Personal blog, small ~50 10–50 ~99% 0–2 $0.00–$0.16 ~$1–$5
Demo / portfolio site ~100 50–200 ~98% 1–4 $0.08–$0.48 ~$3–$15
Reference site (pre-generated) ~500 200–1,000 ~99.5% 1–5 $0.08–$0.60 ~$3–$18
Medium traffic site, many URLs ~5,000 1,000–5,000 ~90% 100–500 $8–$60 $240–$1,800
High traffic, dynamic URLs Unbounded 10,000+ <50% 5,000+ $400+ $12,000+
Reality check: A site with even modest traffic and a large URL space (e.g., generated path parameters, user-specific pages, infinite scroll, search result pages) can run up hundreds of dollars per day. Rabbithole has no built-in rate limiting or spending cap by default. You can burn through your Anthropic API budget in hours if you don't constrain your URL space.

Cost Control Strategies

1. Use the cheapest model that meets your quality bar

Use Haiku for lightweight, high-volume workloads; Sonnet or Opus only when advanced reasoning or coding is required. For most Rabbithole use cases — generating informational pages, blog posts, documentation — Haiku 4.5 produces acceptable output at roughly 1/5th the cost of Sonnet. Test Haiku first before upgrading.

2. Disable web tools unless necessary

Set web_tools = false in config.toml. Web tool calls are the largest per-page cost multiplier. If your site generates fictional content, static documentation, or creative writing, there is no need for live web searches.

3. Pre-generate important pages at deploy time

Use the rabbithole crawl CLI command to pre-warm your cache before going live. Walk your link graph starting from the homepage and generate all reachable pages. When real users arrive, every page will be a cache hit at $0.00. See Deployment for a pre-generation script.

4. Constrain your URL space

Every distinct URL Rabbithole has never seen before is a billable event. Do not allow user-controlled URL parameters to reach the Rabbithole handler. Use a reverse proxy (nginx, Caddy) to block arbitrary paths, limit URL patterns, and prevent URL enumeration attacks that could generate thousands of API calls.

5. Set Anthropic spend limits

In your Anthropic Console, configure a monthly budget cap and configure alerts. Rabbithole itself does not enforce spending limits. A hard cap at Anthropic's side is your safety net. Do not deploy without one.

6. Shorten system prompts

Trim unnecessary context, system instructions, or verbose text — every token adds cost. Review the default system prompt in src/prompt.rs and strip anything that doesn't add value for your specific use case. A 500-token reduction in the system prompt, multiplied across thousands of page generations, adds up quickly.

7. Use the Batch API for pre-generation

Combining prompt caching and the batch API can reduce costs by up to 95%. The Anthropic Batch API offers a 50% discount on all token pricing for asynchronous workloads. If you are pre-generating a large site offline (not serving live requests), batching saves significant money. Batch support is not built into Rabbithole's live server mode but can be used in scripted crawls.

8. Cache aggressively, back up your database

Your SQLite database is your primary cost-saving asset. Back it up regularly. Do not run DELETE FROM pages in production unless you are prepared to pay regeneration costs. Consider mounting rabbithole.db on persistent cloud storage.

Tip: Many teams use Haiku for lightweight queries and Sonnet/Opus for more complex reasoning — balancing cost and performance. In Rabbithole's context, you could route simple pages (indexes, stubs) to Haiku and complex, tool-enriched pages (articles, data pages) to Sonnet. This per-route model selection is not built in but can be implemented by running multiple Rabbithole instances with different configs behind a reverse proxy.

When NOT to Use Rabbithole

To be direct: Rabbithole is a novelty and a research tool. It is not a production CMS for high-traffic sites. The following use cases are poor fits:

Good fits: personal experiments, demos, low-traffic hobbyist sites, AI research projects, small creative projects where the novelty of live generation is the point.

Monitoring Your Spend

Rabbithole logs each API call to stdout including token counts and estimated cost (if log_cost = true in config). For aggregate tracking:

# Example: query your cache to estimate remaining generation surface area
$ sqlite3 rabbithole.db "SELECT COUNT(*) FROM pages;"
142

# Compare to total mapped URLs to estimate % cached
$ sqlite3 rabbithole.db "SELECT COUNT(*) FROM url_mappings;"
387

# ~37% of mapped URLs have been generated (= potential cost exposure)

See also: Configuration Reference, Deployment Guide, Architecture Overview.