How We Reduced Frontend Latency by Collapsing Chatty Request Chains
The Problem: Waterfall Latency
Modern pages often make a series of small API calls in sequence: profile → friends → profiles → notifications → greeting → etc. Each call adds a round trip (RTT). On real networks, ~100–200 ms of RTT per call adds up fast: a chain of dependent requests can add anywhere from hundreds of milliseconds to multiple seconds of latency before any server work even begins. (Our demo uses six calls as an example, but the approach applies to any number of calls.)
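To make the waterfall concrete, here is a minimal sketch in which a fake fetch simply sleeps for one round trip. The function and endpoint names are illustrative stand-ins, not a real API:

```js
// Minimal waterfall sketch: each simulated call pays one round trip (RTT).
const RTT_MS = 100;
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fakeFetch(path) {
  await wait(RTT_MS); // one round trip per request
  return { path, at: Date.now() };
}

// Dependent calls: each one starts only after the previous resolves.
async function loadPageWaterfall() {
  const start = Date.now();
  const profile = await fakeFetch("/profile");
  const friends = await fakeFetch("/friends/" + profile.path.length); // depends on profile
  await fakeFetch("/notifications/" + friends.path.length);           // depends on friends
  return Date.now() - start; // roughly 3 × RTT_MS before any real server work
}
```

Three dependent calls already cost ~300 ms at a 100 ms RTT; six cost ~600 ms, and that is pure network waiting.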
What Cap’n Web Is (and Why It Helps)
Cap’n Web is a JavaScript‑native RPC library with two key ideas:
- HTTP batch + pipelining: queue multiple calls and send them in one request. Promises act like stubs, so you can chain calls before the first resolves.
- Capability‑based RPC: pass references (objects with methods) instead of tokens or IDs. Least‑privilege by default.
It also supports long‑lived WebSocket sessions and bidirectional calls, but this post focuses on the one‑RTT HTTP batch because it solves the waterfall pain directly.
Architecture Comparison: Chained REST vs Batched RPC
Many frontend pages load data through a chain of dependent calls. To make the trade-off concrete, this post compares two implementation styles for the same data needs:
- Chained REST: multiple dependent HTTP requests executed one after another.
- Batched RPC: one HTTP request carrying multiple logical calls.
The key issue is round-trip overhead. As dependency depth increases, network latency often dominates total load time more than server compute does. Batching changes the shape of that cost by collapsing several round trips into one request boundary.
How the Batch Works
Client code starts a batch session, adds calls, and awaits once. Under the hood, Cap’n Web sends one HTTP request carrying all calls, and the server executes them (with pipelining support).
```js
import { newHttpBatchRpcSession } from "capnweb";

const api = newHttpBatchRpcSession("/api");

// Queue calls without awaiting yet
const a = api.a();
const b = api.b();
const c = api.c();
const d = api.d();
const e = api.e();
const f = api.f();

// Send once, await once
const results = await Promise.all([a, b, c, d, e, f]);
```
In a batch, you can even use RpcPromises as parameters to other calls (promise pipelining). That lets you express dependent operations without additional round trips.
Minimal Worker Shape for Evaluation
A practical way to evaluate this architecture is to expose two equivalent service shapes from the same backend:
- /rest/1 … /rest/6: one JSON response per request, for the chained REST flow.
- /api: a Cap'n Web endpoint exposing six methods (a…f) behind a single batched call boundary.
```js
import { RpcTarget, newWorkersRpcResponse } from "capnweb";

// Simulates per-call server work.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class DemoApi extends RpcTarget {
  constructor(delayMs) { super(); this.delayMs = delayMs; }
  async a() { await wait(this.delayMs); return { step: "a", at: Date.now() }; }
  async b() { await wait(this.delayMs); return { step: "b", at: Date.now() }; }
  async c() { await wait(this.delayMs); return { step: "c", at: Date.now() }; }
  async d() { await wait(this.delayMs); return { step: "d", at: Date.now() }; }
  async e() { await wait(this.delayMs); return { step: "e", at: Date.now() }; }
  async f() { await wait(this.delayMs); return { step: "f", at: Date.now() }; }
}

export default {
  fetch(request) {
    // Hand the request to Cap'n Web; it decodes the batch and invokes the methods.
    return newWorkersRpcResponse(request, new DemoApi(120));
  },
};
```
In cross-origin testing environments, set Access-Control-Allow-Origin: * and Timing-Allow-Origin: * so browser timing data is available for analysis.
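A hypothetical helper for that, using the standard Response and Headers APIs (the function name is ours):

```js
// Hypothetical helper: add the headers browsers require before they will
// expose cross-origin resource timing to scripts.
function withTimingHeaders(response) {
  const headers = new Headers(response.headers);
  headers.set("Access-Control-Allow-Origin", "*"); // allow any origin (test environments only)
  headers.set("Timing-Allow-Origin", "*");         // expose detailed timing entries
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

Wrap the response from the fetch handler with this in test builds; without Timing-Allow-Origin, cross-origin entries in the Resource Timing API report zeros for most fields.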
Expected Results
If RTT is ~120 ms and each call does ~120 ms of work:
- 6 sequential REST calls: 6 × (RTT + work) ≈ 1440 ms
- 1 batch (6 calls): 1 × (RTT + work) ≈ 240 ms
Parallel REST improves on sequential, but dependent calls still can't overlap (each waits for its predecessor's data), and many concurrent requests can run into connection limits and head‑of‑line blocking. The batch sends once.
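The arithmetic behind those numbers, as a quick sanity check:

```js
// Back-of-envelope latency model from the numbers above.
const RTT = 120;   // network round trip, ms
const WORK = 120;  // per-call server work, ms
const CALLS = 6;

const sequentialRest = CALLS * (RTT + WORK); // each call pays a full RTT + work
const batched = RTT + WORK;                  // one request boundary
                                             // (assumes the server overlaps the six calls)

console.log({ sequentialRest, batched }); // → { sequentialRest: 1440, batched: 240 }
```

The model ignores connection setup, TLS, and serialization costs, so treat it as a lower bound on the sequential side.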
Trade‑offs and When to Use It
- Great for: page boot, dashboards, “fan‑out” reads, and chained calls (authenticate → me → greet).
- Consider WebSocket: for sustained interactions or server‑initiated callbacks.
- Error handling: await all promises you care about; un‑awaited calls won’t return results in the batch.
- Security: capability‑based design reduces token sprawl and scopes authority to the object you hold.
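On the error-handling point, Promise.allSettled is a convenient way to await every queued call without letting one failure hide the others. A runnable sketch, with plain promises standing in for Cap'n Web RpcPromises:

```js
// Sketch: collect every result and every error from a batch of calls,
// so no outcome is silently dropped.
async function settleBatch(calls) {
  const settled = await Promise.allSettled(calls);
  const results = [];
  const errors = [];
  for (const outcome of settled) {
    if (outcome.status === "fulfilled") results.push(outcome.value);
    else errors.push(outcome.reason);
  }
  return { results, errors };
}
```

Usage: `settleBatch([a, b, c, d, e, f])` returns both lists, so the page can render partial data while reporting the failures.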
Try It Yourself
- Measure a representative page flow in your product and count dependent calls.
- Prototype a batched boundary for the same data needs and compare total time.
- Test under realistic latency to validate impact before rollout.
To build your own, see Cloudflare’s post: Cap’n Web: a new JavaScript RPC library.
