Now in early access

AI that
actually
remembers.

Persistent memory infrastructure for AI agents. Temporal tree architecture, entity-aware retrieval, and structured context that scales with your agents.

Start for free · See how it works

+48.6% temporal reasoning vs mem0

+25.0% multi-hop reasoning vs mem0

0.823 LLM judge score (LOCOMO)

Memory · Caroline · conv_003

Caroline has been exploring career changes

2024-03-01 · root · topic:career

Applied for senior role at tech startup downtown

2024-03-14 · update · entities:caroline,startup

Received offer — $145k, 4 weeks vacation

2024-03-22 · refinement

Considering negotiating for remote flexibility

2024-03-23 · fork

Also interviewing at two other companies

2024-03-15 · parallel · entities:caroline

Retrieval confidence

0.87

Drop-in replacement

Already using mem0?
One line to switch.

Same method names. Temporal tree memory.

Before

from mem0 import MemoryClient
client = MemoryClient(api_key="m0-...")

→

After

from aivery import MemoryClient
client = MemoryClient(api_key="aivery-...")

The problem

Memory is the missing layer
in AI infrastructure

Current systems treat memory as a flat list. Retrieval is a cosine search across everything you've ever stored — with no sense of time, relationships, or context.

Flat retrieval fails at scale

With 1,500+ memories per agent, cosine similarity at K=50 covers only 3% of the corpus. The right memory is unreachable 70% of the time.
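The 3% figure is the retrieval depth divided by the corpus size. A quick check in Python, where `coverage` is an illustrative helper name and not part of any Aivery API:

```python
def coverage(k: int, corpus_size: int) -> float:
    """Largest fraction of the corpus that a top-k retrieval can return."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return min(k, corpus_size) / corpus_size


# Figures quoted above: K=50 against 1,500 memories.
print(f"{coverage(50, 1500):.1%}")   # 3.3%
# The wide-K setting described further down this page (200 candidates).
print(f"{coverage(200, 1500):.1%}")  # 13.3%
```

Coverage only bounds what retrieval can reach. It says nothing about whether the right memory is among the candidates returned.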

No sense of time

Existing systems can't answer "what was she thinking last March?" or "how has her plan changed since she started this job?" Time is invisible.

Multi-hop reasoning breaks

Questions that connect two facts, such as "who introduced Caroline to her startup contact?", need structural memory, not keyword matching.

Architecture

Memory with structure,
not just storage

Temporal tree placement

Every memory is placed into a tree at write time. Related memories become parent/child. Contradictions fork. Time flows from root to leaf — retrieval becomes branch traversal.
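One way to picture this is a small tree in which each node carries a date and a placement kind. The sketch below reuses the Caroline memories from the card at the top of the page. The `MemoryNode` class, its field names, and the choice of which memory is the parent of which are assumptions made for illustration, not Aivery's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MemoryNode:
    """One memory in the tree. All names here are illustrative."""
    text: str
    when: date
    kind: str  # "root", "update", "refinement", "fork", "parallel"
    children: "list[MemoryNode]" = field(default_factory=list)

    def add(self, child: "MemoryNode") -> "MemoryNode":
        # Children stay in chronological order, so walking from root to
        # leaf also walks forward in time.
        self.children.append(child)
        self.children.sort(key=lambda n: n.when)
        return child


def walk(node: MemoryNode, depth: int = 0):
    """Depth-first traversal yielding (depth, node), parents before children."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


root = MemoryNode("Caroline has been exploring career changes", date(2024, 3, 1), "root")
applied = root.add(MemoryNode("Applied for senior role at tech startup downtown", date(2024, 3, 14), "update"))
root.add(MemoryNode("Also interviewing at two other companies", date(2024, 3, 15), "parallel"))
offer = applied.add(MemoryNode("Received offer: $145k, 4 weeks vacation", date(2024, 3, 22), "refinement"))
offer.add(MemoryNode("Considering negotiating for remote flexibility", date(2024, 3, 23), "fork"))

for depth, node in walk(root):
    print("  " * depth + f"{node.when} [{node.kind}] {node.text}")
```

Following a single branch from the root gives the history of one line of thought in date order, which is what lets a question about "last March" be answered by traversal instead of a search over everything.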

Entity heatmap activation

Each query activates entity clusters. Memories tagged with multiple queried entities rank first (intersection ordering). Hot branches surface recent context; cold branches preserve history.
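Intersection ordering can be sketched in a few lines: count how many queried entities each memory is tagged with and sort on that count. The memory tuples below are hypothetical (the Melanie entry is invented for the example), and breaking ties by recency is an assumption, not something this page specifies.

```python
from datetime import date

# Hypothetical memories: (text, entity tags, date written).
memories = [
    ("Applied for senior role at tech startup downtown", {"caroline", "startup"}, date(2024, 3, 14)),
    ("Also interviewing at two other companies", {"caroline"}, date(2024, 3, 15)),
    ("Melanie is searching for an apartment", {"melanie"}, date(2024, 3, 10)),
]


def rank_by_intersection(query_entities, items):
    """Order memories by how many queried entities they share, then by recency."""
    scored = []
    for text, entities, when in items:
        overlap = len(query_entities & entities)
        if overlap:  # memories sharing no queried entity are not activated
            scored.append((overlap, when, text))
    # Higher overlap first; among ties, the more recent memory first.
    scored.sort(key=lambda s: (s[0], s[1]), reverse=True)
    return [text for _, _, text in scored]


for text in rank_by_intersection({"caroline", "startup"}, memories):
    print(text)
```

The memory tagged with both `caroline` and `startup` comes out ahead of the one tagged with `caroline` alone, even though the latter is newer.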

Reranking + context stitching

Wide candidate retrieval (up to 200) is reranked by a cross-encoder that reads query and memory together — bridging the semantic gap between question-form queries and statement-form memories.
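The wide-then-narrow shape of this step can be sketched as follows. The `rerank` function and its parameters are illustrative names, and `token_overlap` is a deliberately crude stand-in for a cross-encoder: a real one would be a trained model that scores the query and memory text together.

```python
from typing import Callable, Sequence


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pair: Callable[[str, str], float],
    wide_k: int = 200,
    top_k: int = 50,
) -> list:
    """Score up to wide_k candidates jointly with the query, keep the best top_k."""
    pool = list(candidates)[:wide_k]
    scored = [(score_pair(query, memory), memory) for memory in pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def token_overlap(query: str, memory: str) -> float:
    """Toy scorer: share of query tokens that also appear in the memory."""
    q = set(query.lower().split())
    m = set(memory.lower().split())
    return len(q & m) / len(q) if q else 0.0


candidates = [
    "Applied to startup",
    "Received offer $145k",
    "Considering remote",
]
results = rerank("what offer did she receive", candidates, token_overlap, top_k=2)
for score, memory in results:
    print(f"{score:.2f}  {memory}")
```

The toy scorer only matches the literal token "offer" and misses "receive" against "Received". That is the question-form versus statement-form gap a cross-encoder is meant to bridge.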

Async validation pipeline

New memories are written immediately and validated asynchronously. Duplicates are removed, contradictions are flagged, and the tree is kept clean, all without blocking your agent's response time.
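A minimal sketch of the write-now, validate-later pattern using `asyncio`. The `MemoryStore` class is hypothetical, duplicate detection here is a naive whitespace and case normalisation, and contradiction flagging is left out entirely.

```python
import asyncio


class MemoryStore:
    """Toy store: writes return at once, validation runs in a background task."""

    def __init__(self) -> None:
        self.memories: dict[int, str] = {}
        self.duplicates: list[int] = []
        self._next_id = 0
        self._pending: asyncio.Queue = asyncio.Queue()

    def write(self, text: str) -> int:
        # The memory is readable immediately; validation is only queued.
        memory_id = self._next_id
        self._next_id += 1
        self.memories[memory_id] = text
        self._pending.put_nowait(memory_id)
        return memory_id

    async def validate_forever(self) -> None:
        seen: dict[str, int] = {}
        while True:
            memory_id = await self._pending.get()
            key = " ".join(self.memories[memory_id].lower().split())
            if key in seen:
                # Keep the earlier copy, drop the later one.
                del self.memories[memory_id]
                self.duplicates.append(memory_id)
            else:
                seen[key] = memory_id
            self._pending.task_done()

    async def drain(self) -> None:
        await self._pending.join()


async def main() -> MemoryStore:
    store = MemoryStore()
    worker = asyncio.create_task(store.validate_forever())
    store.write("Received offer: $145k")
    store.write("received  offer: $145k")  # near-identical copy
    print(len(store.memories))  # 2: both copies are visible right after writing
    await store.drain()
    worker.cancel()
    print(len(store.memories))  # 1: the later copy was removed in the background
    return store


store = asyncio.run(main())
```

Because `write` never awaits, the caller's latency does not depend on how long validation takes.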

Entity heatmap · query: "Caroline's job offer"

🔥 hot · Activated branches

Caroline · career, startup · offer, salary · negotiation, remote work

🌿 warm · Related entities

Melanie · friend, apartment search

❄ cold · Preserved history

previous job · retail, college · 2019

Reranking · top 50 of 200 candidates

Received offer $145k

0.94

Considering remote

0.81

Applied to startup

0.73

Products

One memory layer,
four surfaces

From personal chat memory to enterprise agent infrastructure — a complete stack built on the same temporal tree foundation.

Personal

Aivery Think

A chat interface with real persistent memory. Think remembers what you tell it, how your plans evolve, and who matters to you — across every session.

Temporal tree memory — tracks how your thinking evolves

Entity-aware — remembers people, places, and their relationships

Memory browser — see and manage exactly what it knows

Local model support on Personal Pro

Developer API

Aivery Fabric

The memory API for developers building AI agents and applications. Nine endpoints covering write, retrieve, context, validate, share, and more.

POST /memory/context — structured context block for your LLM

POST /memory/validate — semantic dedup before storage

Hybrid vector + lexical retrieval, org-scoped

Plug-in reranker: Cohere or self-hosted ONNX
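To give a feel for calling one of these endpoints, the sketch below builds (but does not send) a request to `POST /memory/context` with the Python standard library. Only the path and the HTTP method come from this page. The base URL, the bearer-token header, and every field in the JSON body are placeholders chosen for illustration.

```python
import json
import urllib.request

# Assumptions: the base URL, the auth header scheme, and the body fields
# are placeholders. Only "/memory/context" and POST come from the page.
BASE_URL = "https://api.example.com"


def build_context_request(api_key: str, agent_id: str, query: str) -> urllib.request.Request:
    """Build a POST /memory/context request without sending it."""
    body = json.dumps({"agent_id": agent_id, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/memory/context",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_context_request("aivery-...", "agent-1", "Caroline's job offer")
print(req.get_method(), req.full_url)
# Sending would be urllib.request.urlopen(req); it is omitted here because
# the host above is a placeholder.
```

The actual request and response shapes should be taken from the Fabric API reference.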

Agent Runtime

Aivery Cortex

The agent runtime that connects memory to reasoning. Cortex wraps any LLM with a memory-aware loop — context retrieval, tool dispatch, and response generation in one place.

Ingestion

Aivery Path

Bulk import your documents, notes, and data into memory. Upload a file, paste text, or POST to the API — Path extracts, validates, and structures everything automatically.

Sync

Aivery Wind

Continuous memory sync from your existing data sources. Pluggable connectors for filesystems, GitHub, Slack, and more keep memory current without you lifting a finger.

Research

Benchmarked against
leading memory systems

Evaluated on LOCOMO — the industry's most comprehensive conversational memory benchmark. 1,540 questions across single-hop, temporal, multi-hop, and open-domain categories.

System                          LLM judge score   vs mem0
Aivery · Sonnet · wide-K        0.823             +0.184
Aivery · GPT-4.1 · wide-K       0.800             +0.162
mem0                            0.638             baseline
Flat retrieval + Cohere         0.557             —
Flat retrieval (no reranker)    0.555             —

LLM Judge on same harness, same judge model (GPT-4.1-mini). Reranker identity contributes +0.002 in isolation — the gains come from temporal tree structure and retrieval width, not the reranker choice. Wide-K = 200 candidates → Cohere selects top 50.
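The deltas can be rechecked from the scores listed above. Note that subtracting the rounded scores gives +0.185 for the Sonnet row, one thousandth above the +0.184 shown. That would be consistent with the published delta having been computed from unrounded scores, though this is an assumption.

```python
# LLM judge scores as printed above.
scores = {
    "Aivery · Sonnet · wide-K": 0.823,
    "Aivery · GPT-4.1 · wide-K": 0.800,
    "mem0": 0.638,
    "Flat retrieval + Cohere": 0.557,
    "Flat retrieval (no reranker)": 0.555,
}

baseline = scores["mem0"]
for name, score in scores.items():
    print(f"{name:30s} {score:.3f} {score - baseline:+.3f}")

# Effect of the reranker alone on flat retrieval.
reranker_only = scores["Flat retrieval + Cohere"] - scores["Flat retrieval (no reranker)"]
print(f"reranker in isolation: {reranker_only:+.3f}")
```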

Read the full paper →

+48.6%

Temporal reasoning

mem0 scores 0.137 on temporal questions; Aivery scores 0.623, a gain of 0.486 in judge score (48.6 points). When questions involve time (sequences, changes, relative dates), temporal tree structure is decisive.
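The headline figure is an absolute difference in judge score, not a relative improvement. A short check of both readings, using only the two scores quoted above:

```python
mem0_temporal = 0.137
aivery_temporal = 0.623

absolute_gain = aivery_temporal - mem0_temporal  # difference in score
relative_gain = aivery_temporal / mem0_temporal  # ratio of the two scores

print(f"absolute: {absolute_gain:+.3f} ({absolute_gain * 100:.1f} points)")
print(f"relative: {relative_gain:.2f}x")
```

Read as percentage points, +48.6 matches the scores. Read as a ratio, the same two numbers put Aivery at roughly four and a half times mem0's temporal score.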

+25.0%

Multi-hop reasoning

Flat retrieval scores 0.29 on multi-hop. Tree + entity heatmap reaches 0.68. Full feature stack reaches 0.792. Intersection-count ordering — not reranking — enables cross-entity reasoning.

+12.3%

Wide retrieval uplift

Expanding from K=50 to K=200 before reranking adds 0.123 to the judge score (12.3 points), the single largest ablation step. Retrieval coverage is a first-class architectural concern, not a hyperparameter.

Pricing

Start free.
Scale as you grow.

Personal plans for individuals, API plans for developers and teams. All plans include access to the core temporal tree architecture.

Personal

For individuals exploring AI-assisted memory

$0 forever

✓500 memories

✓Think chat interface

✓1 agent

✓Core retrieval

Get started

Pro

For power users who want full memory capability

$15 /mo

$149/yr — save 17%

✓50,000 memories

✓Think + Fabric API

✓Cortex agent runtime

✓Custom plugins

✓5GB storage

Start free trial

Pro+

For local-first users and advanced builders

$29 /mo

$249/yr — save 28%

✓Unlimited memories

✓Local model support

✓Local agent mode

✓Path bulk import

✓Priority support

Get started

Growth

For small teams and startups building with agents

$99 /mo

$990/yr — save 17%

✓3 users, 2 agents

✓500K memories, 20GB

✓Fabric API

✓Cortex agent runtime

✓Path bulk import

✓1 Wind connection

✓Standard support

Get started

Limited availability

Founding Team

$500/mo

Price locks in permanently when you join. Retails at $2,500/mo after founding slots close.

✓10 users, 5 agents

✓5M memories, 100GB

✓Full Fabric API

✓Cortex + Path + Wind

✓3 Wind connectors

✓Dedicated support + roadmap access

Founding slots available

Claim your slot

Enterprise

Unlimited agents, dedicated Fabric cluster, on-prem or VPC deployment, custom SLAs, and a co-development roadmap. Built for teams that can't afford to forget.

Talk to us

Give your agents
real memory.