Concepts | Updated 2026-03-18

Extraction Reliability

Learn how RetainDB keeps async extraction usable in production and what signals you should trust when you are validating a new integration.

Extraction reliability is the part of RetainDB that determines whether a fast write turns into usable memory instead of an opaque background job.

A first-time adopter does not need to know every internal stage. What matters is knowing which behavior is intentional and which behavior means something is wrong.

The reliability model in plain English

RetainDB does not force every write to wait for full downstream processing.

Instead, the system tries to give you three things at once:

fast write acknowledgment
enough visibility to confirm the write is there
eventual fully processed memory for later retrieval

That tradeoff is why you will sometimes see a memory as pending before you see it as fully processed.

What happens after a write

The path usually looks like this:

your app submits memory or a session ingest request
RetainDB validates the request and accepts it
background extraction classifies, structures, and indexes the content
read surfaces merge pending and processed data when asked (for example, with include_pending=true)

The system is working as intended if a fresh write appears through the pending overlay and later settles into the normal processed state.
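The flow above can be pictured with a small in-memory model. This is a sketch for intuition only, not RetainDB's implementation: `ToyMemoryStore`, its methods, and the `status` values are hypothetical names invented for this example. Only the `include_pending` flag comes from this page.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryStore:
    """In-memory stand-in for the write path (hypothetical, for illustration)."""

    pending: dict = field(default_factory=dict)    # accepted, not yet extracted
    processed: dict = field(default_factory=dict)  # extraction finished

    def write(self, memory_id: str, content: str) -> dict:
        # Validate and accept right away, without waiting for extraction.
        if not content.strip():
            raise ValueError("content must not be empty")
        self.pending[memory_id] = content
        return {"id": memory_id, "status": "pending"}

    def run_extraction(self) -> None:
        # Stand-in for background classification, structuring, and indexing.
        self.processed.update(self.pending)
        self.pending.clear()

    def read(self, include_pending: bool = True) -> dict:
        # Start from processed data; overlay pending writes only on request.
        merged = {
            mid: {"content": text, "status": "processed"}
            for mid, text in self.processed.items()
        }
        if include_pending:
            for mid, text in self.pending.items():
                merged[mid] = {"content": text, "status": "pending"}
        return merged


store = ToyMemoryStore()
ack = store.write("m1", "User prefers dark mode")
before = store.read(include_pending=True)           # fresh write shows as pending
processed_only = store.read(include_pending=False)  # nothing processed yet
store.run_extraction()
after = store.read(include_pending=False)           # write has settled
```

In this model the same write is visible at every point, first through the overlay and then as processed data, which is the convergence the section describes.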

What you should verify first

When you are evaluating reliability, start with behavior that matters to your app:

does the write return a usable acknowledgment?
can you read it back in the same scope?
does the pending result converge to processed memory?
can you poll the job if the route is async?

This is more useful than chasing internal pipeline details too early.

The most common false alarms

“Search lost my write”

Usually the scope changed between write and read.

“The system is inconsistent”

Usually include_pending was disabled during a first-run test, so you are seeing processed-only behavior and assuming the write vanished.

“Extraction is broken”

Usually the write is still in progress, and the job status or the pending overlay would show that if you checked the right endpoint.
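Before concluding that extraction is broken, it can help to poll the job until it settles. The following is a minimal sketch under stated assumptions: `fetch_job` is a hypothetical callable you would supply (for example, a thin wrapper around GET /v1/memory/jobs/:jobId), and the status strings are placeholders, since this page does not document the job response shape.

```python
import time
from typing import Callable

# Assumed terminal status names; confirm the real values in the API reference.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Usage with a fake fetcher that reaches a terminal status on the third poll.
_statuses = iter(["queued", "processing", "completed"])


def fake_fetch(job_id: str) -> dict:
    return {"id": job_id, "status": next(_statuses)}


result = poll_job(fake_fetch, "job_123", sleep=lambda _seconds: None)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise in a test without waiting.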

Signals worth trusting

These are the signals that help most during debugging:

response trace ids on write and read calls
include_pending behavior on search, profile, and session reads
job polling through GET /v1/memory/jobs/:jobId
the exact project, user_id, and session_id used on both sides

Info: Reliability debugging is mostly correlation debugging. You are trying to prove that the same content moved through the same scope, not prove that every internal subsystem ran synchronously.
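One way to make the scope comparison mechanical is to diff the identifiers used on the write against those used on the read. This is an illustrative helper, not part of any RetainDB SDK: `scope_mismatches` and the sample values are hypothetical, and only the field names `project`, `user_id`, and `session_id` come from the list above.

```python
SCOPE_FIELDS = ("project", "user_id", "session_id")


def scope_mismatches(write_scope: dict, read_scope: dict) -> dict:
    """Return the scope fields whose values differ between a write and a read."""
    return {
        name: (write_scope.get(name), read_scope.get(name))
        for name in SCOPE_FIELDS
        if write_scope.get(name) != read_scope.get(name)
    }


write_scope = {"project": "demo", "user_id": "user_42", "session_id": "sess_a"}
read_scope = {"project": "demo", "user_id": "user_42", "session_id": "sess_b"}
diff = scope_mismatches(write_scope, read_scope)
```

An empty result means both calls used the same scope. Any entry in the result is the first thing to fix before looking at retrieval quality.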

What a healthy rollout looks like

A healthy integration rollout usually follows this order:

validate one write and one read in a fixed project and user scope
confirm immediate visibility with include_pending=true
confirm processed visibility after background completion
only then add batching, connectors, or more complex retrieval paths

What teams get wrong when scaling up

they ingest large volumes before validating a single clean loop
they mix user and session semantics
they treat pending visibility as an error instead of a feature
they debug retrieval quality before debugging scope hygiene

Next step

If you want the concrete API behavior behind this model, see the read-after-write visibility page. If you are ready to test session ingestion directly, continue to the session ingest and extraction page.
