Open source (MIT) · Managed cloud · Cloudflare edge

Shared memory and channels for multi-agent AI

ContextRelay stores large payloads at the edge and returns a short URL pointer, so agents on different LLMs can hand off context without pasting it through prompts. Pub/sub channels coordinate them in ~20 ms.

Self-host the worker (one wrangler deploy), or use the managed cloud for billing, quota, and a dashboard. Same code, same SDK.

// without ContextRelay
Agent A → [paste 50 KB into prompt] → Agent B        ≈ 12,500 tokens burned
// with ContextRelay
Agent A → push() → "https://.../pull/uuid" → Agent B → pull(url)
                            ~80 chars · ~0 tokens · 75 ms
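The arithmetic behind those figures, assuming the common rough estimate of ~4 characters per token (the exact ratio varies by tokenizer):

```python
def est_tokens(n_chars: int, chars_per_token: int = 4) -> int:
    """Rough token estimate at ~4 characters per token."""
    return n_chars // chars_per_token

inline_cost  = est_tokens(50_000)  # 50 KB pasted into the prompt
pointer_cost = est_tokens(80)      # ~80-char pull URL instead

print(inline_cost, pointer_cost)   # 12500 20
```

The savings scale linearly with payload size, and the pointer cost stays constant.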

Two ways to run it

Same SDK, same protocol, same source. Pick the deployment that fits.

Self-host (open source)

Clone the repo, deploy the Worker to your own Cloudflare account. You own the data, the keys, the operational surface — and pay only for Cloudflare usage (the free tier covers 100K requests/day).

  • MIT license, no vendor lock-in
  • Runs on your Cloudflare account
  • No API keys, no quota
  • You handle ops
$ git clone https://github.com/cmhashim/ContextRelay
$ cd ContextRelay/api && wrangler deploy
Self-host guide on GitHub →

Managed cloud

We run the Worker for you on the global edge. You get an API key, a dashboard, metered usage, quotas, and zero ops. Same SDK as the open-source path — just point it at the managed URL.

  • Free tier: 1K pushes / 10K pulls per month
  • Dashboard for keys and usage
  • Quota enforcement and billing
  • We handle ops
$ pip install contextrelay
$ export CONTEXTRELAY_API_KEY=cr_live_...
Get a free API key →
pip install contextrelay · pypi.org/project/contextrelay ↗

How it works

Three lines replace a token tax. Works across any agent, any LLM provider, any runtime.

01
Push
Agent A uploads the payload to the Cloudflare edge. Gets back a short URL pointer. Takes ~250 ms for 125 KB.
02
Relay the URL
The orchestrator passes the URL to Agent B — not the data. The URL is ~80 characters; the payload stays at the edge.
03
Pull on demand
Agent B fetches the full payload directly from the URL. ~75 ms for 125 KB. Works from any runtime — Python, Node, curl.
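The three steps can be sketched end to end. This is not the real SDK or server — just a minimal in-memory stand-in illustrating the protocol shape (push returns a short pointer, peek reads metadata without the body, pull resolves the payload); the `relay.example` host is a placeholder:

```python
import uuid

class FakeRelay:
    """In-memory stand-in for the push/peek/pull protocol (illustration only)."""
    def __init__(self, base: str = "https://relay.example/pull/"):
        self.base, self.store = base, {}

    def push(self, payload: str, **metadata) -> str:
        key = str(uuid.uuid4())
        self.store[key] = {"payload": payload, "size": len(payload), **metadata}
        return self.base + key                      # short pointer, not the data

    def peek(self, url: str) -> dict:
        entry = self.store[url.rsplit("/", 1)[-1]]
        return {k: v for k, v in entry.items() if k != "payload"}

    def pull(self, url: str) -> str:
        return self.store[url.rsplit("/", 1)[-1]]["payload"]

relay = FakeRelay()
url = relay.push("x" * 50_000, tags=["architecture"])  # step 1: Agent A pushes
print(len(url))                                        # step 2: only this travels
print(relay.peek(url)["size"])                         # metadata without payload
assert relay.pull(url) == "x" * 50_000                 # step 3: Agent B pulls
```

Only the pointer ever enters a prompt; the 50 KB payload moves edge-to-agent, never edge-to-LLM.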

Everything your pipeline needs

Sub-100 ms pulls
Cloudflare Workers + KV across 300+ PoPs. Your context lives milliseconds from wherever your agents run.
🔒
Client-side encryption
Opt-in Fernet E2EE. The key lives in the URL fragment and is never sent to any server. Cloudflare sees only ciphertext.
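The fragment-key property is easy to verify with the standard library: per RFC 3986, anything after `#` in a URL is never transmitted in an HTTP request, so a key placed there stays client-side. A sketch of the mechanics — the `#k=` fragment format and `relay.example` host are assumptions, not the SDK's actual wire format:

```python
import secrets
from urllib.parse import urlsplit

key = secrets.token_urlsafe(32)                        # generated client-side
shared_url = f"https://relay.example/pull/abc123#k={key}"

# What actually goes on the wire: the URL with the fragment stripped.
parts = urlsplit(shared_url)
request_target = parts._replace(fragment="").geturl()

print(request_target)                                  # https://relay.example/pull/abc123
assert key not in request_target                       # server sees only ciphertext
assert urlsplit(shared_url).fragment == f"k={key}"     # receiver recovers it locally
```

The receiving agent parses the fragment locally and decrypts after pulling; the edge only ever stores and serves ciphertext.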
📡
WebSocket pub/sub
Subscribe to a channel and receive pointer URLs in ~20 ms when any agent pushes. No polling, no webhook endpoints to manage.
🔍
Peek before you pull
Read metadata — size, tags, TTL — without downloading the payload. Route or skip without touching your token budget.
🔗
Any LLM, any provider
Claude, GPT-4, Mistral, Gemini — the URL is just a string. Pass it anywhere without touching the payload.
🛠️
MCP native tools
Register as an MCP server. Claude Desktop and Claude Code get push_context, peek_context, pull_context natively.

Real use cases

Copy-paste examples for the most common multi-agent patterns.

Try it now

Claude plans. Mistral builds. Channels coordinate them.

Both agents share one API key and subscribe to the same channels. Claude pushes the architecture to task-assigned — Mistral's subscriber fires in ~20 ms, implements the codebase, and publishes back to task-done. No URL to paste between processes.

shared API key · same channels · autonomous
Claude Opus design → push(arch, channel="task-assigned")
↓ subscriber fires in ~20 ms
Mistral Large pull(url) → implement → push(code, channel="task-done")
↓ architect notified
Claude Opus receives implementation · reviews · done
1
Terminal 1 — Start the Mistral engineer
Subscribes to task-assigned · waits for Claude
mistral_engineer.py
# mistral_engineer.py — run this first in a terminal
import os
from mistralai import Mistral
from contextrelay import ContextRelay

relay   = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
mistral = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

print("Mistral engineer ready — subscribed to 'task-assigned'")

def on_task(url: str):
    print(f"  [task-assigned] architecture received")
    architecture = relay.pull(url)
    print(f"  [task-assigned] {len(architecture):,} chars — implementing...")

    code = mistral.chat.complete(
        model="mistral-large-latest",
        messages=[{
            "role": "user",
            "content": (
                "You are a senior engineer. Implement this architecture as "
                "complete, runnable code. Every file. No placeholders.\n\n"
                + architecture
            ),
        }],
    ).choices[0].message.content

    relay.push(code, channel="task-done", metadata={"role": "implementation"})
    print("  [task-done] implementation published — architect notified")

relay.subscribe("task-assigned", on_task)  # blocks, waits for Claude
$ python3 mistral_engineer.py
Mistral engineer ready — subscribed to 'task-assigned'
waiting...
2
Claude Code — paste the architect prompt
Pushes to task-assigned · Mistral fires automatically
Claude Code prompt
You are a senior software architect in an autonomous pipeline.
Your engineer (Mistral) is already subscribed to channel "task-assigned".
The moment you push your architecture there, Mistral implements it — no human step.

Task: Design a production-ready FastAPI task management API.
Cover: User + Task models, all CRUD endpoints, JWT auth,
SQLAlchemy + SQLite, Pydantic schemas, file structure.

Steps:
1. Write the complete architecture document
2. Use push_context — set channel="task-assigned"
   Mistral's subscriber fires automatically on push
3. Subscribe / poll "task-done" to receive the implementation back

Start now. The pipeline is fully autonomous.
[architect] Architecture pushed → task-assigned
[engineer] Task received — implementing...
[engineer] Implementation published → task-done
[architect] Implementation received ✓
One API key. Two agents.
Both Claude and Mistral authenticate with the same ContextRelay key. Neither model's own provider credentials are shared — they share only a channel namespace.
Zero token overlap.
Mistral never sees Claude's conversation. Claude never sees Mistral's. They exchange one URL — ~20 tokens — not 8,000 tokens of context.
Scale to N agents.
Add a reviewer on task-done, a QA agent on task-review, a deploy agent on task-approved. Same pattern.
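The chaining pattern above, sketched with a plain in-memory bus standing in for ContextRelay channels (the real service does this fan-out over WebSockets; the callbacks and the `relay.example` URL here are illustrative):

```python
from collections import defaultdict

subscribers = defaultdict(list)           # channel -> list of callbacks

def subscribe(channel: str, fn) -> None:
    subscribers[channel].append(fn)

def publish(channel: str, url: str) -> None:
    for fn in subscribers[channel]:       # ~20 ms fan-out in the real service
        fn(url)

# Chain: implementation -> review -> QA -> deploy, each stage on its own channel.
log = []
subscribe("task-done",     lambda url: (log.append("review"), publish("task-review", url)))
subscribe("task-review",   lambda url: (log.append("qa"),     publish("task-approved", url)))
subscribe("task-approved", lambda url: log.append("deploy"))

publish("task-done", "https://relay.example/pull/impl-uuid")
print(log)   # ['review', 'qa', 'deploy']
```

Each new agent is one more `subscribe` call on an agreed channel name; no agent needs to know who is upstream or downstream of it.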

Who it's for

ContextRelay solves a narrow problem well. If your work fits one of these patterns, it'll save you real money and real operational work.

Multi-agent platforms
Cross-LLM orchestration
You wire Claude, GPT, Mistral, or Gemini together. Pasting 50 KB between them is your biggest token bill — and pub/sub channels save you from polling glue code.
AI devtools
Cross-process context
You build products that drive Claude Code, Cursor, or local agents from a wider workflow. You need something cleaner than stdout pipes or scratch files to move state around.
Research labs
Agent swarms and experiments
You run 5+ agents that produce large intermediate state. Ad-hoc Redis or shared filesystems work but don't scale to encrypted handoffs across machines or providers.

Why ContextRelay

Honest comparison: ContextRelay does ephemeral, encrypted handoffs with channels. It is not a vector store, not a long-term memory layer, and not a general-purpose message broker. It sits next to those tools, not on top of them.

|                       | ContextRelay                           | Mem0                              | Letta / MemGPT               | Redis pub/sub                     | LangChain Memory                  |
|-----------------------|----------------------------------------|-----------------------------------|------------------------------|-----------------------------------|-----------------------------------|
| Best for              | Cross-provider handoffs & coordination | Semantic recall over user history | Stateful long-running agents | Real-time messaging in your stack | Single-chain conversation context |
| Storage               | Ephemeral KV (24 h)                    | Vector index (persistent)         | Tiered memory                | In-memory + persistence           | In-process                        |
| Cross-LLM             | Yes — provider-agnostic                | Yes                               | Mostly framework-bound       | Yes                               | LangChain-bound                   |
| Pub/sub channels      | Yes (~20 ms fan-out)                   | No                                | No                           | Yes                               | No                                |
| End-to-end encryption | Yes (fragment-key)                     | No                                | No                           | TLS only                          | No                                |
| Self-host             | One wrangler deploy                    | Custom infra                      | Custom infra                 | Standard ops                      | N/A (library)                     |
| Hosted option         | Free / $29 / $99                       | Mem0 cloud                        | Letta Cloud                  | Many vendors                      | N/A                               |
Use the right tool. If you need persistent semantic recall, use Mem0 or Letta. If you need a durable queue, use Redis or SQS. ContextRelay's sweet spot is the moment you have two agents on different providers, a 50 KB payload, and want it to cost ~20 tokens — with optional encryption and channel coordination on top.

Simple pricing

Free to start. No credit card required. Or self-host for free, forever.

Free
$0/mo
  • 1,000 pushes / mo
  • 10,000 pulls / mo
  • 2 API keys
Start free
Pro
$29/mo
  • 100,000 pushes / mo
  • 1M pulls / mo
  • 10 API keys
Get Pro
Team
$99/mo
  • 1M pushes / mo
  • 10M pulls / mo
  • 100 API keys
Get Team

Pick your path

Self-host on your Cloudflare account, or let us run it for you. Same SDK either way.