← Blog

From RAG to Recall

RAG retrieves documents. Ember recalls experiences. Why the distinction matters for AI agents.

Robert Praul//7 min read

Every AI memory system built in the last two years follows the same pattern: embed, search, retrieve. It has a name: Retrieval-Augmented Generation. RAG.

RAG is great for knowledge. You have a corpus of documents, the user asks a question, you find the relevant chunks, and inject them as context. The agent answers with grounded information it wouldn't have otherwise.

But RAG is not memory.

This isn't pedantry. It's a design distinction that changes how agents behave, how they feel to interact with, and what kinds of applications you can build.

The Fundamental Difference

RAGEmber
TriggerExplicit queryInvoluntary convergence
Scoring axes1 (semantic similarity)7 (semantic + emotional + sensory + temporal + spatial + relational + musical)
Minimum to fireAny cosine match above threshold3+ dimensions must converge independently
OutputRanked list of text chunksIgnition with intensity tier + dimensional breakdown
Refractory periodNone (same doc can be retrieved repeatedly)24h per memory (no fixation)
Data modelDocuments / text chunksExperiences with multi-dimensional metadata
Feels likeA search engineA person remembering

Trigger Model: Query vs. Convergence

In RAG, retrieval requires intent:

# RAG: explicit retrieval
query = "What restaurants did we discuss?"
results = vector_store.similarity_search(query, k=5)
context = "\n".join([r.page_content for r in results])
# → Agent answers with retrieved context

The user (or system) decides when to search. The query determines what comes back. Memory is an explicit operation.

In Ember, recall is involuntary:

# Ember: involuntary recall
message = "It's raining pretty hard tonight"
result = ember.check(message)
# → No one asked for a memory
# → But rain + evening + contemplative tone converged
# → A childhood memory of thunderstorms ignites at warm intensity

Nobody asked for a memory. Nobody queried anything. The convergence of sensory + temporal + emotional signals caused a recall event that the agent didn't choose.

This is the distinction: RAG is a tool the agent uses. Ember is something that happens to the agent.

Scoring: One Axis vs. Seven

RAG scores on a single axis: semantic similarity. The query embedding is compared against document embeddings via cosine distance. Closer = more relevant.

Ember scores on 7 independent dimensions:

Semantic0.30
Emotional0.25
Sensory0.15
Relational0.10
Temporal0.10
Spatial0.05
Musical0.05

No single dimension can trigger recall. The semantic dimension (0.30 weight) is necessary but not sufficient. A memory only ignites when 3+ dimensions converge independently.

This means Ember is harder to trigger than RAG (which fires on any cosine match above threshold) but more precise when it does fire. A warm ignition means real multi-dimensional resonance between present context and past experience.

Output: Results vs. Experiences

RAG returns a ranked list of text chunks:

# RAG output
[
  {"text": "...", "score": 0.87, "metadata": {...}},
  {"text": "...", "score": 0.82, "metadata": {...}},
]

Ember returns an ignition result with intensity and dimensional breakdown:

# Ember output
IgnitionResult(
    fired=True,
    intensity="warm",         # faint / warm / vivid
    score=0.52,
    dimensions_fired=5,       # out of 7
    memory_text="Dawn patrol at El Porto...",
    dimension_scores={
        "semantic": 0.71,
        "emotional": 0.65,
        "sensory": 0.48,
        "temporal": 0.80,
        "spatial": 0.92,
        "relational": 0.0,
        "musical": 0.0,
    },
)

The dimensional breakdown tells you why a memory fired. Not just “this was relevant” but “the spatial proximity was 92%, the temporal alignment was 80%, and the emotional tone matched at 65%.”

When to Use What

Use RAG when:

  • You have a knowledge base the agent needs to query
  • Users ask explicit questions (“What does the docs say about X?”)
  • Accuracy of retrieval matters more than experiential quality
  • The data is documents, not lived experiences

Use Ember when:

  • Memory should surface without being asked for
  • The data is experiential — moments, feelings, relationships
  • You want the agent to feel like it remembers, not that it searched
  • Multi-dimensional context matters (emotion + place + time + people)
  • You're building companions, journaling tools, therapy bots, creative writing aids

Use both when:

  • Your agent needs knowledge retrieval and experiential memory
  • RAG handles the “what do I know?” and Ember handles the “what do I remember?”
  • The same conversation needs both factual grounding and emotional resonance

Architecture Comparison

RAG Pipeline:
  Query → Embed → Vector Search → Top-K → Inject → Generate

Ember Pipeline:
  Message → Signal Constellation (7 dims) → Vector Search
    → Multi-Dimensional Scoring → Ignition Gates
    → Refractory Check → Intensity Tier → Inject → Generate

The Ember pipeline has more stages, but each stage is fast. Signal extraction is lexicon-based (no LLM call). Vector search uses the same cosine similarity as RAG. Multi-dimensional scoring is simple arithmetic. The entire pipeline runs in under 50ms on CPU.

The Real Difference Is Behavioral

The technical differences are interesting, but the behavioral difference is what matters to users:

RAG-powered agent

User: “It's raining pretty hard tonight.”

Agent: “I see! Would you like me to check the weather forecast for tomorrow?”

No memory triggered. The agent responds to the surface content.

Ember-powered agent

User: “It's raining pretty hard tonight.”

[4 dimensions converge: semantic (weather) + sensory (rain/auditory) + temporal (evening) + emotional (contemplative)]

Agent: “Something about heavy rain at night. You ever just sit and listen to it? Reminds me of that Levittown thunder you talked about — the way the whole sky would open up and you'd stand on the porch just watching.”

The memory surfaced involuntarily. The response is colored by recall, not search.

One agent has a database. The other has a past.

Add Recall to Your Agent

pip install ember-experiences

Works alongside your existing RAG pipeline — GitHub — Apache 2.0