xumux: multiplexing over anything


TL;DR: Multiplexing combines multiple signals or data streams into a single transmission so they can be sent together efficiently. At the receiving end, everything is separated again (a process called demultiplexing). Deft needed a multiplexing layer for VROOM and we realized it was useful on its own. So we created xumux — a transport-agnostic protocol for multiplexing typed channels over any connection. 8-byte fixed header, works over WebRTC, WebSocket, QUIC, TCP, or pipes. VROOM builds on it. Anything else can too.


How We Got Here

When I wrote about VROOM, the multiplexing layer underneath was called OpenMux and lived inside the VROOM spec as an implementation detail. It worked fine — until we started wanting to use that same multiplexing for things that had nothing to do with remote sessions.

Classic mistake: embedding a general-purpose solution inside a specific-purpose protocol. So we pulled it out, gave it a name, gave it its own spec, and stopped pretending that channel multiplexing was VROOM’s job.

What Is xumux?

xumux (pronounced “zuh-mux”) is an open protocol for multiplexing typed, named channels over any reliable or semi-reliable transport.

  • 8-byte fixed header. Channel ID (2 bytes), type (1 byte), flags (1 byte), payload length (4 bytes). Simple enough to implement in an afternoon.
  • Up to 65,535 channels per connection, each with its own type namespace and reliability guarantees.
  • Transport agnostic. Same framing over WebRTC DataChannels, WebSocket, QUIC, TCP, or stdio. Swap the transport, keep the protocol.
  • Structured handshake. CBOR-encoded capability negotiation before application data flows.
  • Fragmentation with interleaving. Large messages on one channel don’t block latency-sensitive traffic on others.

xumux doesn’t define what you send — just how you multiplex it. Everything above the framing layer is your protocol’s problem.
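To make the header concrete, here is a minimal Python sketch of packing and unpacking a frame. The field order follows the list above; the network (big-endian) byte order and the function names are assumptions for illustration, so check the spec before relying on either.

```python
import struct

# Assumed layout: network byte order, fields in the order listed above.
# H = channel ID (2 bytes), B = type (1), B = flags (1), I = payload length (4).
HEADER = struct.Struct("!HBBI")

def encode_frame(channel_id: int, msg_type: int, flags: int, payload: bytes) -> bytes:
    """Prefix a payload with the 8-byte header."""
    return HEADER.pack(channel_id, msg_type, flags, len(payload)) + payload

def decode_frame(data: bytes) -> tuple[int, int, int, bytes]:
    """Split one frame into (channel_id, msg_type, flags, payload)."""
    channel_id, msg_type, flags, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return channel_id, msg_type, flags, payload

frame = encode_frame(7, 1, 0, b"hello")
print(HEADER.size, decode_frame(frame))
```

The whole framing layer is one `struct` format string plus a length check, which is roughly what "implement in an afternoon" looks like in practice.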

Why Not Just Keep It in VROOM?

Because VROOM has opinions about remote sessions — terminal capability negotiation, View/Interact/Voice modes, PTY management. Those are good opinions for what VROOM does. They’re irrelevant opinions to impose on someone who just wants to multiplex three data streams over a WebSocket.

The separation made both things better. VROOM got shorter and more focused on session semantics. xumux got general enough to be useful for things I haven’t thought of yet.

Why This Matters for Agentic Communications

Agent-to-agent communication is about to outgrow HTTP request/response. The emerging pattern is agents maintaining many simultaneous channels with different reliability requirements over single persistent connections.

A coding agent talking to a remote environment might need:

  • A reliable ordered channel for terminal I/O
  • An unreliable channel for screen frames (frame 48 supersedes frame 47)
  • A reliable channel for file sync
  • A reliable channel for the control plane
  • An unreliable channel for telemetry

Without multiplexing, that’s five separate connections to manage and reconnect independently. With xumux, it can be one connection with five channels.
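As a rough illustration, the five channels above could be described by a small table, with a demultiplexer routing incoming payloads by channel ID. The channel IDs, names, and the `Demux` class are hypothetical and not taken from the spec.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Channel:
    channel_id: int
    name: str
    reliable: bool

# Hypothetical IDs and names for the coding-agent example.
CHANNELS = [
    Channel(1, "terminal", reliable=True),
    Channel(2, "screen", reliable=False),
    Channel(3, "file-sync", reliable=True),
    Channel(4, "control", reliable=True),
    Channel(5, "telemetry", reliable=False),
]

class Demux:
    """Route (channel_id, payload) pairs from one connection to per-channel inboxes."""

    def __init__(self, channels):
        self.by_id = {c.channel_id: c for c in channels}
        self.inbox = defaultdict(list)

    def deliver(self, channel_id, payload):
        channel = self.by_id.get(channel_id)
        if channel is None:
            raise KeyError(f"unknown channel {channel_id}")
        self.inbox[channel.name].append(payload)

demux = Demux(CHANNELS)
for cid, payload in [(1, b"ls\n"), (4, b"cancel"), (1, b"pwd\n")]:
    demux.deliver(cid, payload)
print(dict(demux.inbox))
```

One connection carries all three messages, and each lands in the inbox of the channel it was addressed to.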

Scale to an agent orchestrator coordinating 20 worker agents and the difference gets dramatic: if each worker needs the same five channels, that is 100 connections without multiplexing versus 20 with it. Connection establishment has real costs: TLS handshakes, authentication, state management. xumux pays those costs once per connection. New channels open with a single message in microseconds.

Head-of-line blocking matters too. When a “cancel this task” control message needs to arrive now, it shouldn’t wait behind a 4 MB streaming transfer. Fragmentation with interleaving solves this: large payloads get split into fragments that interleave with urgent messages on other channels.
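The idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `FLAG_MORE` bit, the round-robin scheduler, and the function names are all hypothetical, and the actual spec may mark and schedule fragments differently.

```python
from collections import deque

FLAG_MORE = 0x01  # hypothetical flag bit: more fragments of this message follow

def fragment(channel_id, payload, max_size):
    """Yield (channel_id, flags, chunk) tuples with chunks of at most max_size bytes."""
    for start in range(0, len(payload), max_size):
        chunk = payload[start:start + max_size]
        last = start + max_size >= len(payload)
        yield channel_id, 0 if last else FLAG_MORE, chunk

def interleave(queues):
    """Round-robin across per-channel queues, sending one fragment per turn."""
    pending = deque(q for q in queues if q)
    while pending:
        q = pending.popleft()
        yield q.popleft()
        if q:
            pending.append(q)

def reassemble(frames):
    """Rebuild complete messages, in the order they finish arriving."""
    buffers, done = {}, []
    for channel_id, flags, chunk in frames:
        buffers[channel_id] = buffers.get(channel_id, b"") + chunk
        if not flags & FLAG_MORE:
            done.append((channel_id, buffers.pop(channel_id)))
    return done

bulk = deque(fragment(3, b"x" * 10, max_size=4))     # three fragments
urgent = deque(fragment(4, b"cancel", max_size=16))  # fits in one fragment
wire = list(interleave([bulk, urgent]))
print([cid for cid, _, _ in wire])
print(reassemble(wire))
```

In this toy run the cancel message completes after the first bulk fragment instead of waiting for the whole transfer. A real scheduler would likely prioritize by channel rather than plain round-robin.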

Most agent frameworks that hit this problem end up building bespoke multiplexing that they never document and that nothing else can interoperate with. xumux is an attempt at making that unnecessary.

Current State

xumux is at v0.1.0-draft. The spec is at xumux.org (redirects to GitHub).

The initial consumer is VROOM, which builds both VROOM-Graphical and VROOM-Terminal on top of xumux channels.

Transport bindings are defined for:

  • WebRTC DataChannels — most natural fit; per-channel reliability maps directly
  • WebSocket — channels share one stream; fragmentation provides interleaving
  • QUIC / WebTransport — channels map to QUIC streams; unreliable channels use datagrams
  • TCP / stdio — magic bytes (OMUX) identify the protocol on the wire

New transports (such as serial ports) can be added without touching the core spec.
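For byte-stream transports like TCP or stdio, a reader might check the magic bytes and then pull length-prefixed frames off the stream. The sketch below assumes the magic is sent once at the start of the stream and that the header is in network byte order; neither detail is confirmed here, and the helper names are hypothetical. An in-memory `io.BytesIO` stands in for a socket or pipe.

```python
import io
import struct

MAGIC = b"OMUX"                  # magic bytes named in the TCP / stdio binding
HEADER = struct.Struct("!HBBI")  # assumed network byte order

def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-frame")
        buf += chunk
    return buf

def read_frames(stream):
    """Check the magic prefix, then yield (channel_id, type, flags, payload)."""
    if read_exact(stream, len(MAGIC)) != MAGIC:
        raise ValueError("not an xumux stream")
    while True:
        head = stream.read(HEADER.size)
        if not head:
            return  # clean end of stream between frames
        if len(head) < HEADER.size:
            head += read_exact(stream, HEADER.size - len(head))
        channel_id, msg_type, flags, length = HEADER.unpack(head)
        yield channel_id, msg_type, flags, read_exact(stream, length)

wire = MAGIC + HEADER.pack(1, 0, 0, 2) + b"hi" + HEADER.pack(4, 0, 0, 0)
frames = list(read_frames(io.BytesIO(wire)))
print(frames)
```

Because the payload length lives in the fixed header, the reader never has to scan for delimiters: it reads 8 bytes, then exactly as many bytes as the header announces.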

It’s early. The spec is stable enough to build on, but we expect refinements as edge cases surface, particularly around per-channel flow control for high-channel-count scenarios. We’re testing reference implementations in Python, TypeScript, and Go now.


xumux is MIT licensed and open at xumux.org. VROOM is at vroom.md. If you’re about to write your own channel framing format — maybe check this one first.