Memory Modes: EXTRACTED vs VERBATIM vs HYBRID

Choose how HippoDid processes and stores memories for each character.

Table of contents
  1. Overview
  2. EXTRACTED (default)
  3. VERBATIM
  4. HYBRID
  5. Comparison table
  6. Setting the memory mode
    1. At character creation
    2. Updating an existing character
    3. Python SDK
    4. Via MCP tools
  7. Choosing the right mode
  8. Storage cost comparison
  9. Next steps

Overview

Every HippoDid character has a memory mode that controls how add_memory processes incoming content. The mode is set when the character is created and can be changed at any time.

| Mode | Processing | Best for |
| --- | --- | --- |
| EXTRACTED | AI categorizes and extracts structured facts | Most use cases (default) |
| VERBATIM | Stores exact text as-is, no AI processing | Compliance, legal, exact quotes |
| HYBRID | AI extraction + verbatim archive | Maximum fidelity when budget allows |

EXTRACTED (default)

When you call add_memory with content like:

“The client prefers to be contacted by email, not phone. They use Slack for internal comms and have a renewal date of March 2027.”

EXTRACTED mode runs the content through HippoDid’s AI pipeline, which:

  1. Splits the text into individual facts
  2. Assigns each fact a category (e.g., preferences, events)
  3. Scores salience (0.0 to 1.0)
  4. Detects and resolves conflicts with existing memories
  5. Generates vector embeddings for semantic search

Result: three separate memories stored:

| Category | Content | Salience |
| --- | --- | --- |
| preferences | Client prefers email contact, not phone | 0.75 |
| preferences | Client uses Slack for internal communication | 0.60 |
| events | Client renewal date: March 2027 | 0.85 |

When to use: Most of the time. EXTRACTED gives the best search results, the cleanest memory organization, and automatic deduplication.
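
As a rough illustration of the result above, the three extracted memories can be modeled as plain records on the client side. This is a minimal sketch: the `ExtractedMemory` class and its field names are assumptions chosen to mirror the table columns, not the HippoDid response schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedMemory:
    """Hypothetical record shape; field names mirror the table, not a real schema."""
    category: str
    content: str
    salience: float

    def __post_init__(self) -> None:
        # Salience is documented as a score from 0.0 to 1.0.
        if not 0.0 <= self.salience <= 1.0:
            raise ValueError(f"salience out of range: {self.salience}")


# The three memories from the example above.
memories = [
    ExtractedMemory("preferences", "Client prefers email contact, not phone", 0.75),
    ExtractedMemory("preferences", "Client uses Slack for internal communication", 0.60),
    ExtractedMemory("events", "Client renewal date: March 2027", 0.85),
]

# Rank by salience, highest first.
ranked = sorted(memories, key=lambda m: m.salience, reverse=True)

# Group memory contents by category.
by_category = {}
for memory in memories:
    by_category.setdefault(memory.category, []).append(memory.content)
```

Here `ranked[0]` is the renewal-date memory, and `by_category["preferences"]` holds the two contact-related facts.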


VERBATIM

VERBATIM mode stores the exact text you provide with no AI processing. The content is saved as a single memory entry with the category you specify (or uncategorized if you do not).

Using the same input:

“The client prefers to be contacted by email, not phone. They use Slack for internal comms and have a renewal date of March 2027.”

Result: one memory stored:

| Category | Content | Salience |
| --- | --- | --- |
| uncategorized | The client prefers to be contacted by email, not phone. They use Slack for internal comms and have a renewal date of March 2027. | 0.50 |

When to use:

  • Compliance and legal: you need an exact record of what was said, with no AI paraphrasing
  • Audit trails: regulators require the original text, not a summary
  • Exact quotes: customer quotes, verbatim feedback, specific instructions
  • Cost control: no AI processing means no AI operation costs

Trade-offs:

  • No automatic categorization or fact splitting
  • No conflict detection or deduplication
  • Search relies on keyword overlap rather than semantic understanding
  • You get back exactly what you put in, nothing more
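
To make the search trade-off concrete, here is a toy keyword-overlap score. It is only a sketch of the general idea: HippoDid's actual ranking for VERBATIM memories is not documented here, and the `tokens` and `keyword_overlap` helpers are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap(query: str, content: str) -> int:
    """Count the distinct words shared by the query and the stored text."""
    return len(tokens(query) & tokens(content))


stored = (
    "The client prefers to be contacted by email, not phone. "
    "They use Slack for internal comms and have a renewal date of March 2027."
)

# A query that reuses the stored wording matches on both words.
literal = keyword_overlap("renewal date", stored)

# A paraphrase with the same intent shares no words with the stored text.
paraphrase = keyword_overlap("when does our contract expire", stored)
```

The paraphrased query scores zero even though it asks about the same fact, which is the kind of miss that semantic search over extracted memories is meant to avoid.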

HYBRID

HYBRID mode does both: it runs the full EXTRACTED pipeline and also archives the original verbatim text. You get the structured, searchable facts plus an unmodified copy.

Using the same input, HYBRID produces:

| Category | Content | Salience | Type |
| --- | --- | --- | --- |
| preferences | Client prefers email contact, not phone | 0.75 | extracted |
| preferences | Client uses Slack for internal communication | 0.60 | extracted |
| events | Client renewal date: March 2027 | 0.85 | extracted |
| verbatim_archive | The client prefers to be contacted by email, not phone. They use Slack for internal comms and have a renewal date of March 2027. | 0.50 | verbatim |

When to use:

  • Regulated industries: you need AI-powered search and retrieval but also need to keep exact originals for audit
  • Session transcripts: store the structured facts for the agent and the raw transcript for compliance
  • High-value interactions: customer success calls, legal consultations, medical notes

Trade-offs:

  • Uses the most storage (about 1.7x that of EXTRACTED alone in the storage cost comparison below)
  • Costs one AI operation per add_memory call (same as EXTRACTED)
  • The verbatim archive is searchable but not as well-organized as the extracted facts
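
One way to picture how the three modes relate is a small dispatch function that returns the records each mode would store for a single input. This is an illustrative sketch, not HippoDid's implementation: `plan_storage` is a hypothetical helper, and `naive_extract` is a stand-in that only splits on sentence boundaries, whereas the real pipeline uses AI extraction (and would yield three facts for the sample input rather than two).

```python
import re


def naive_extract(text: str) -> list[str]:
    """Stand-in for the AI pipeline: splits on sentence boundaries only."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def plan_storage(mode: str, text: str) -> list[dict]:
    """Return the records a given mode would store for one input (illustrative)."""
    if mode not in {"EXTRACTED", "VERBATIM", "HYBRID"}:
        raise ValueError(f"unknown memory mode: {mode}")
    records = []
    if mode in {"EXTRACTED", "HYBRID"}:
        # Structured facts, one record per extracted statement.
        records += [{"type": "extracted", "content": fact} for fact in naive_extract(text)]
    if mode in {"VERBATIM", "HYBRID"}:
        # The untouched original, stored as a single record.
        records.append({"type": "verbatim", "content": text})
    return records


sample = (
    "The client prefers to be contacted by email, not phone. "
    "They use Slack for internal comms and have a renewal date of March 2027."
)
hybrid_records = plan_storage("HYBRID", sample)
```

HYBRID always yields the EXTRACTED records plus exactly one verbatim record whose content equals the input.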

Comparison table

| Feature | EXTRACTED | VERBATIM | HYBRID |
| --- | --- | --- | --- |
| AI processing | Yes | No | Yes |
| Fact splitting | Yes | No | Yes |
| Auto-categorization | Yes | No | Yes |
| Conflict detection | Yes | No | Yes |
| Semantic search quality | Best | Basic | Best |
| Exact original preserved | No | Yes | Yes |
| Storage per input | Low | Lowest | Highest |
| AI operation cost | 1 op | 0 ops | 1 op |
| Compliance-ready | Partial | Full | Full |

Setting the memory mode

At character creation

curl -X POST https://api.hippodid.com/v1/characters \
  -H "Authorization: Bearer hd_key_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Legal Client - Smith",
    "categoryPreset": "standard",
    "memoryMode": "VERBATIM"
  }'

Updating an existing character

curl -X PATCH https://api.hippodid.com/v1/characters/CHARACTER_ID \
  -H "Authorization: Bearer hd_key_..." \
  -H "Content-Type: application/json" \
  -d '{
    "memoryMode": "HYBRID"
  }'
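
The two curl calls above can also be built with Python's standard library alone. This sketch only constructs the requests and does not send them; the URL, headers, and JSON fields are taken from the curl examples, while `build_request` and the `YOUR_API_KEY` placeholder are hypothetical. The response format is not covered here.

```python
import json
import urllib.request

API_BASE = "https://api.hippodid.com/v1"


def build_request(method: str, path: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request with the headers from the curl examples."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method=method,
    )


# Create a character with VERBATIM mode set up front.
create = build_request(
    "POST",
    "/characters",
    "YOUR_API_KEY",
    {"name": "Legal Client - Smith", "categoryPreset": "standard", "memoryMode": "VERBATIM"},
)

# Switch an existing character to HYBRID; CHARACTER_ID is a placeholder.
update = build_request(
    "PATCH",
    "/characters/CHARACTER_ID",
    "YOUR_API_KEY",
    {"memoryMode": "HYBRID"},
)
```

To send either request, pass it to `urllib.request.urlopen`, which raises `urllib.error.HTTPError` on 4xx and 5xx responses.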

Python SDK

from hippodid import HippoDid

client = HippoDid(api_key="hd_key_...")

# Create the character first
character = client.create_character(
    name="Legal Client - Smith",
)

# Then set the memory mode with a separate call
# (the REST API also accepts memoryMode in the create request)
client.set_memory_mode(character.id, "VERBATIM")

# Switch to HYBRID later
client.update_character(
    character_id=character.id,
    memory_mode="HYBRID",
)

Via MCP tools

If you are using HippoDid through Claude Code, Cursor, or another MCP client:

Use hippodid to set the memory mode of "Legal Client - Smith" to HYBRID

The MCP set_memory_mode tool accepts EXTRACTED, VERBATIM, or HYBRID.
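
If you pass the mode from your own code or from user input, it can help to validate it before making the call. The following is a minimal client-side sketch; `MemoryMode` and `parse_memory_mode` are hypothetical names, and whether the API itself accepts lowercase values is not stated here, so the helper normalizes to uppercase.

```python
from enum import Enum


class MemoryMode(str, Enum):
    """The three documented memory modes."""
    EXTRACTED = "EXTRACTED"
    VERBATIM = "VERBATIM"
    HYBRID = "HYBRID"


def parse_memory_mode(value: str) -> MemoryMode:
    """Normalize input and reject anything outside the three documented modes."""
    try:
        return MemoryMode(value.strip().upper())
    except ValueError:
        allowed = ", ".join(mode.value for mode in MemoryMode)
        raise ValueError(f"memory mode must be one of: {allowed}") from None


mode = parse_memory_mode(" hybrid ")
```

Because `MemoryMode` subclasses `str`, `mode.value` can be placed directly into a JSON body such as `{"memoryMode": mode.value}`.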


Choosing the right mode

Start with EXTRACTED (the default). It gives the best search results and keeps memory organized automatically. Switch only when you have a specific reason.

Use VERBATIM when:

  • You are in a regulated industry (healthcare, legal, finance) and auditors need exact records
  • You want zero AI processing cost
  • You are storing content that must not be paraphrased (JSON, code snippets, exact quotes)

Use HYBRID when:

  • You need both: great search results from extraction and exact originals for compliance
  • You are logging high-value customer interactions where both the structured facts and the raw transcript matter
  • Budget is not a constraint and you want maximum fidelity
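
The guidance above can be condensed into a small decision helper. This is one possible reading of the rules, offered as a sketch: `choose_mode` and its parameter names are hypothetical, and real decisions may weigh factors (such as budget) that two booleans cannot capture.

```python
def choose_mode(needs_exact_original: bool, wants_ai_extraction: bool = True) -> str:
    """Pick a memory mode from two yes/no questions (illustrative only)."""
    if not wants_ai_extraction:
        # Zero AI processing cost, or content that must not be paraphrased.
        return "VERBATIM"
    if needs_exact_original:
        # Structured facts for search plus an unmodified copy for audit.
        return "HYBRID"
    # The default: best search results and automatic organization.
    return "EXTRACTED"


default_choice = choose_mode(needs_exact_original=False)
audit_choice = choose_mode(needs_exact_original=True)
no_ai_choice = choose_mode(needs_exact_original=True, wants_ai_extraction=False)
```

With these inputs, `default_choice` is `"EXTRACTED"`, `audit_choice` is `"HYBRID"`, and `no_ai_choice` is `"VERBATIM"`.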

Storage cost comparison

Assuming 1,000 memory inputs per month, each averaging 200 words:

| Mode | Memories stored | AI ops | Relative storage |
| --- | --- | --- | --- |
| EXTRACTED | ~3,000 (after fact splitting) | 1,000 | 1x |
| VERBATIM | 1,000 (one per input) | 0 | 0.7x |
| HYBRID | ~4,000 (extracted + verbatim) | 1,000 | 1.7x |

Exact numbers depend on the density of facts in your inputs. Inputs with many distinct facts produce more extracted memories.
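
The figures in the table can be reproduced with simple arithmetic. The sketch below assumes three extracted facts per input and a verbatim copy that takes 0.7x the storage of the extracted facts; both values are read off the table rather than measured, and `estimate` is a hypothetical helper.

```python
def estimate(mode: str, inputs: int, facts_per_input: int = 3, verbatim_ratio: float = 0.7) -> dict:
    """Estimate monthly memories, AI ops, and storage relative to EXTRACTED."""
    if mode not in {"EXTRACTED", "VERBATIM", "HYBRID"}:
        raise ValueError(f"unknown memory mode: {mode}")
    uses_extraction = mode in {"EXTRACTED", "HYBRID"}
    keeps_verbatim = mode in {"VERBATIM", "HYBRID"}

    extracted = inputs * facts_per_input if uses_extraction else 0
    verbatim = inputs if keeps_verbatim else 0
    relative = (1.0 if uses_extraction else 0.0) + (verbatim_ratio if keeps_verbatim else 0.0)

    return {
        "memories": extracted + verbatim,
        # One AI operation per add_memory call whenever extraction runs.
        "ai_ops": inputs if uses_extraction else 0,
        "relative_storage": round(relative, 2),
    }


monthly = {mode: estimate(mode, inputs=1000) for mode in ("EXTRACTED", "VERBATIM", "HYBRID")}
```

Raising `facts_per_input` for fact-dense inputs increases the memory counts for EXTRACTED and HYBRID while leaving VERBATIM at one memory per input.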


Next steps


Copyright © 2026 SameThoughts. HippoDid is proprietary software. Open-source components (Spring Boot Starter, MCP Server) are Apache 2.0.
