Billing

Memcone uses a compute unit model — you pay for work done, not for storage or number of memories.

Trial

Every new account gets 200 compute units to try Memcone — no credit card required. At typical usage patterns, that covers 50–200 API calls depending on call type.

Pricing

After the trial: $0.80 per 1,000 compute units.

| Call | Units | Notes |
|---|---|---|
| POST /v1/context (cache hit) | 1 | Served from Redis; fast and cheap |
| POST /v1/context (cache miss) | 3 | Full retrieval + cache population |
| POST /v1/context?mode=fresh | 5 | Bypasses cache, always recomputes |
| POST /v1/remember | 3 | Extraction, embedding, contradiction check |
| POST /v1/recall | 1 | Semantic search, no caching |
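As a rough illustration of how the table turns into a bill, the sketch below maps each call type to its unit cost and converts a unit total to dollars. The dictionary keys and function names are made up for this example and are not part of the Memcone API; only the unit values and the $0.80 price come from this page.

```python
# Unit costs copied from the pricing table; the key names are illustrative only.
UNITS_PER_CALL = {
    "context_hit": 1,    # POST /v1/context served from cache
    "context_miss": 3,   # POST /v1/context with full retrieval
    "context_fresh": 5,  # POST /v1/context?mode=fresh
    "remember": 3,       # POST /v1/remember
    "recall": 1,         # POST /v1/recall
}
PRICE_PER_1K_UNITS_USD = 0.80


def total_units(call_counts: dict[str, int]) -> int:
    """Sum compute units for a mapping of call type -> number of calls."""
    return sum(UNITS_PER_CALL[kind] * count for kind, count in call_counts.items())


def cost_usd(units: float) -> float:
    """Convert compute units to dollars at the post-trial price."""
    return units / 1000 * PRICE_PER_1K_UNITS_USD
```

For example, `total_units({"remember": 2, "recall": 1, "context_fresh": 1})` adds up to 12 units (6 + 1 + 5).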

Estimating your bill

A typical SaaS AI app with 1,000 daily active users, each sending 10 messages/day, with one /v1/remember call and one /v1/context call per message:

10,000 /v1/remember calls × 3 units = 30,000 units
10,000 /v1/context calls at 70% hit rate:
- 7,000 hits × 1 unit = 7,000
- 3,000 misses × 3 units = 9,000
- Subtotal: 16,000 units
Total: 46,000 units/day

Over a 30-day month, that's 1,380,000 units. At $0.80 per 1,000 units, the bill comes to $1,104/month for 1,000 daily active users, or about $1.10/user/month.
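The same arithmetic can be written as a small function so you can plug in your own numbers. This is a sketch under the assumptions used in the example: one /v1/remember and one /v1/context call per message, no mode=fresh or /v1/recall traffic, and a 30-day month. The function and parameter names are illustrative.

```python
def estimate_monthly_cost(users, messages_per_user_per_day, hit_rate, days=30):
    """Return (units_per_day, units_per_month, usd_per_month).

    Assumes each message triggers one remember call (3 units) and one
    context call (1 unit on a cache hit, 3 units on a miss).
    """
    messages_per_day = users * messages_per_user_per_day

    remember_units = messages_per_day * 3

    hits = messages_per_day * hit_rate
    misses = messages_per_day - hits
    context_units = hits * 1 + misses * 3

    units_per_day = remember_units + context_units
    units_per_month = units_per_day * days
    usd_per_month = units_per_month / 1000 * 0.80  # $0.80 per 1,000 units
    return units_per_day, units_per_month, usd_per_month


per_day, per_month, usd = estimate_monthly_cost(1000, 10, hit_rate=0.70)
print(f"{per_day:,.0f} units/day, {per_month:,.0f} units/month, ${usd:,.2f}/month")
```

With the inputs shown, this reproduces the figures above: 46,000 units/day, 1,380,000 units/month, and $1,104 per month.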

Reducing costs

The single biggest lever is cache hit rate. Every cache hit costs 1 unit instead of 3–5.

Cache hit rate improves when:

- Users return to the same scopeId across sessions (the cache is persistent)
- The task string is consistent for similar calls (small punctuation and casing changes are normalized, but different intent still creates a new cache key)
- You call remember in batches rather than per-message where possible

Check your current hit rate with GET /v1/usage.
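To see how much the hit rate matters, here is a minimal sketch of the average cost of a standard /v1/context call as a function of hit rate, using the 1-unit hit and 3-unit miss costs from the pricing table. It ignores mode=fresh calls, and the function names are illustrative rather than anything Memcone provides.

```python
HIT_UNITS = 1   # POST /v1/context, cache hit
MISS_UNITS = 3  # POST /v1/context, cache miss


def units_per_context_call(hit_rate: float) -> float:
    """Average units per context call at a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * HIT_UNITS + (1.0 - hit_rate) * MISS_UNITS


def monthly_context_cost_usd(calls_per_day: int, hit_rate: float, days: int = 30) -> float:
    """Dollar cost of context calls alone, at $0.80 per 1,000 units."""
    units = calls_per_day * units_per_context_call(hit_rate) * days
    return units / 1000 * 0.80


for rate in (0.5, 0.7, 0.9):
    print(f"{rate:.0%} hit rate: {units_per_context_call(rate):.2f} units/call")
```

At 10,000 context calls per day, moving from a 70% to a 90% hit rate cuts context usage from 16,000 to 12,000 units/day, which is about $96 less per 30-day month.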

Billing cycle

- Monthly billing
- Usage tracked from account creation date
- Trial units are one-time, not monthly
- View current usage in the dashboard

Paying

Upgrade your plan in the billing dashboard. Payments are processed by Stripe.

MCP rate limits

MCP tool traffic has its own rate limit, separate from compute-unit billing: 60 requests per minute per API key, shared across all MCP tools. See the MCP docs and the rate limit callout on the API Keys page.

MCP remember, recall, and context still consume compute units on the same meter as direct POST /v1/* calls. Free-tier limits use that live total at request time.
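If you drive MCP tools from your own code, a client-side limiter can keep you under the 60 requests per minute per key limit. The sketch below is one possible approach using a sliding window. Whether the server enforces a fixed or sliding window is not stated here, so treat this as a conservative assumption; the class name and its methods are hypothetical and not part of any Memcone SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds span."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock    # injectable so the limiter can be tested
        self._sent = deque()   # timestamps of requests still inside the window

    def _evict_expired(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self) -> bool:
        """Record a request and return True if one may be sent now."""
        now = self._clock()
        self._evict_expired(now)
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False

    def seconds_until_available(self) -> float:
        """How long to wait before the next request would be allowed."""
        now = self._clock()
        self._evict_expired(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])
```

A caller would check `try_acquire()` before each MCP request and sleep for `seconds_until_available()` when it returns False. Use one limiter per API key, since the limit is shared across all MCP tools.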
