Turbo Mode
Context calls are cached in Redis per scopeId + task pair. Task matching normalizes away minor casing, quote, punctuation, and whitespace differences, but a different intent still creates a new cache entry. Hits are optimized for low latency (often under 20ms p50). Misses run the full retrieval path (on the order of ~200ms) and populate the cache. Any “fast” comparison should therefore refer to cached /v1/context calls, not to every retrieval path or to self-hosted vector infrastructure you still have to run.
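The exact normalization rules aren't documented here, but the idea can be sketched in a few lines. The `normalizeTask` helper below is hypothetical — an illustration of folding away surface differences before a cache key is built, not Memcone's actual implementation:

```javascript
// Hypothetical sketch: fold away casing, quote, punctuation, and
// whitespace differences before deriving a cache key. Memcone's
// real normalization rules may differ.
function normalizeTask(task) {
  return task
    .toLowerCase()                    // casing differences
    .replace(/[\u2018\u2019]/g, "'")  // curly → straight single quotes
    .replace(/[\u201C\u201D]/g, '"')  // curly → straight double quotes
    .replace(/\s+/g, ' ')             // collapse runs of whitespace
    .trim()
    .replace(/[.,!?;:]+$/, '');       // trailing punctuation
}

// Same intent, different surface form → same key, shared cache entry:
normalizeTask('Summarize the user’s goals!');     // → "summarize the user's goals"
normalizeTask('  summarize the user\'s goals  '); // → "summarize the user's goals"
// Different intent → different key, new cache entry:
normalizeTask("Delete the user's goals");
```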
Modes
| Mode | Behavior | P50 latency |
|---|---|---|
| fast (default) | Read cache first. Miss → compute + store | <20ms cached / ~200ms miss |
| fresh | Always bypass cache. Full retrieval every time. | ~200ms |
How to use
Pass mode as a query parameter or header:
# query parameter
POST /v1/context?mode=fast
POST /v1/context?mode=fresh
# or as a header
X-Memcone-Mode: fast
X-Memcone-Mode: fresh
// fast (default)
const res = await fetch('https://api.memcone.com/v1/context', {
method: 'POST',
headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' },
body: JSON.stringify({ scopeId, task }),
})
// fresh — bypasses cache
const res = await fetch('https://api.memcone.com/v1/context?mode=fresh', {
method: 'POST',
headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' },
body: JSON.stringify({ scopeId, task }),
})
Cache invalidation
The cache is invalidated automatically whenever a remember call changes memory for the same scope; you never need to clear it manually.
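The version-counter scheme behind this can be modeled in a few lines. The following is an illustrative in-memory sketch, not Memcone's implementation — `remember`, `context`, and the key layout here are stand-ins:

```javascript
// Minimal in-memory model of per-scope version-counter invalidation.
// Illustrative only; stand-in for what the service does server-side.
const versions = new Map(); // scopeId → current version
const cache = new Map();    // cache key → stored result

const versionOf = (scopeId) => versions.get(scopeId) ?? 0;
const keyFor = (scopeId, task) => `${scopeId}:v${versionOf(scopeId)}:${task}`;

function remember(scopeId) {
  // Bumping the version makes every key built with the old version
  // unreachable — all prior entries for the scope are instantly stale.
  versions.set(scopeId, versionOf(scopeId) + 1);
}

function context(scopeId, task, compute) {
  const key = keyFor(scopeId, task);
  if (cache.has(key)) return { result: cache.get(key), cache_hit: true };
  const result = compute(); // full retrieval path
  cache.set(key, result);
  return { result, cache_hit: false };
}
```

First call misses and caches; a repeat hits; after `remember` bumps the version, the next call recomputes under the new key.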
POST /v1/remember { scopeId: "user_123", ... }
→ if new memory was stored or an active belief changed, Memcone bumps the version counter for "user_123"
→ all prior cache entries for "user_123" are immediately stale
→ next /v1/context call recomputes and caches fresh result
Reading cache status
Every /v1/context response includes:
{
"result": "...",
"cache_hit": true,
"tokens_saved": 84
}
Response headers also expose it:
X-Memcone-Cache: HIT
X-Memcone-Latency-Ms: 14
X-Memcone-Tokens-Saved: 84
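Either surface can be read in client code. The field and header names below are taken from this page; the `readCacheStatus` helper itself is hypothetical, not part of any SDK:

```javascript
// Hypothetical helper: surface cache status from a /v1/context response.
// Works with any Response-like object exposing .json() and .headers.get().
async function readCacheStatus(res) {
  const body = await res.json();
  return {
    cacheHit: body.cache_hit,                                // JSON body field
    tokensSaved: body.tokens_saved,                          // JSON body field
    headerHit: res.headers.get('X-Memcone-Cache') === 'HIT', // response header
    latencyMs: Number(res.headers.get('X-Memcone-Latency-Ms')),
  };
}
```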
When to use fresh
- After a series of remember calls, when you want the most current state immediately
- In background jobs where latency doesn't matter
- When debugging, because fresh always shows the live state of memory
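To avoid sprinkling mode strings through a codebase, the choice can be centralized in a small request builder. `contextRequest` is a hypothetical convenience wrapper around the documented query parameter, not part of an official SDK:

```javascript
// Hypothetical wrapper: pin the mode per call via the documented
// ?mode= query parameter. Nothing here is an official SDK API.
function contextRequest(scopeId, task, { mode = 'fast', key } = {}) {
  const url = new URL('https://api.memcone.com/v1/context');
  if (mode === 'fresh') url.searchParams.set('mode', 'fresh');
  return {
    url: url.toString(),
    init: {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ scopeId, task }),
    },
  };
}

// Usage:
// const { url, init } = contextRequest(scopeId, task, { mode: 'fresh', key });
// const res = await fetch(url, init);
```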