Engineering | 2025-11-17 | 5 min read

Sub-Millisecond Compression at the Cloudflare Edge

Running compression at the edge means it happens close to your data and adds negligible latency. Here is how smoltext does it.

Tags: smoltext, Cloudflare edge, low latency, edge compression, sub-millisecond

Why Location Matters

Compression that lives in a distant region adds a round trip to every operation. By running at the Cloudflare edge, smoltext performs compression close to where your requests already are, so the added latency is dominated by your existing network path rather than a detour to a central service.

Sub-Millisecond Work

The compression itself — semantic JSON parsing, codebook lookup, dictionary deflate — completes in sub-millisecond time for small payloads. There is little data to process, the codebook is resident, and the algorithms are tuned for the small-payload band. The work is not the bottleneck; the network is.

The Edge Advantage for Small Data

Small payloads are perfect for edge compression for a structural reason: the work is bounded and fast, so it slots into a request path without buffering or backpressure. You would not run a heavy adaptive compressor on a large file at the edge, but compressing a 60-byte record is trivial there.

Integration Shapes

There are two common ways to use the edge API:

Direct call: your service POSTs records to https://api.smoltext.sprapp.com/v1/compress and stores the result. Simplest to adopt.
In-path at the edge: compression happens inside an edge function in your own request flow, so records are already compact before they leave the edge.

Both keep the compression close to the data.
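
As a rough illustration of the direct-call shape, here is a minimal sketch that builds the POST request without sending it. Only the endpoint URL comes from this post. The JSON body shape (`{"records": [...]}`), the bearer-token header, and the `build_compress_request` helper are assumptions made for the example, so check the API reference before relying on them.

```python
import json
import urllib.request

# Endpoint taken from this post; everything else below is an assumption.
COMPRESS_URL = "https://api.smoltext.sprapp.com/v1/compress"


def build_compress_request(records, api_key):
    """Build (but do not send) a POST request for a list of records.

    Assumed, not confirmed by this post: the {"records": [...]} body
    shape and bearer-token authentication.
    """
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        COMPRESS_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_compress_request(
    [{"user_id": 48213, "status": "active"}], "YOUR_API_KEY"
)
# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request, timeout=5) as response:
#       compressed = response.read()
```

Keeping request construction separate from sending makes the call easy to unit test and easy to swap for whatever HTTP client your service already uses.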

Keeping the Codebook Resident

A key reason smoltext is fast is that the trained codebook and shared dictionary live at the edge, not inside each message. Because they are resident, there is no per-request cost to load them and no per-message cost to store them. Every compression call reuses the same warm structures.
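
To make the resident-dictionary idea concrete, the sketch below uses the preset-dictionary feature of Python's standard `zlib` module. It is not smoltext's implementation: the dictionary contents, the sample record, and the helper names are hypothetical, and zlib stands in for the real codebook and dictionary. It only shows the general effect of keeping a shared dictionary outside each message.

```python
import zlib

# Hypothetical shared dictionary: byte sequences expected to recur in records.
# It is held by both sides and is never stored inside a compressed message.
SHARED_DICT = (
    b'{"user_id":,"status":"inactive","region":"us-east"}'
    b'{"user_id":,"status":"active","region":"eu-west"}'
)


def compress_record(record: bytes) -> bytes:
    """Deflate one record against the preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=SHARED_DICT)
    return compressor.compress(record) + compressor.flush()


def decompress_record(blob: bytes) -> bytes:
    """Inflate one record; the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=SHARED_DICT)
    return decompressor.decompress(blob) + decompressor.flush()


record = b'{"user_id":48213,"status":"active","region":"eu-west"}'
with_dict = compress_record(record)
without_dict = zlib.compress(record, 9)

# Prints the raw size, then the plain-zlib size, then the preset-dictionary size.
print(len(record), len(without_dict), len(with_dict))
```

Because the record's keys and common values already appear in the dictionary, the compressor can emit back-references instead of literals, which is why a tiny payload still shrinks.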

Latency Budgeting

For latency-sensitive paths, the right move is batching: amortize the single round trip across many records. Compressing one record and compressing a thousand records in one call cost nearly the same network time, so batching effectively hides the round trip.
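
The amortization argument can be written as a simple model: per-record latency is roughly the round trip divided by the batch size, plus the per-record work. The numbers below (a 20 ms round trip, 0.05 ms of work per record) are illustrative assumptions, not measurements, and the model ignores the extra transfer time of a larger request body.

```python
def per_record_latency_ms(round_trip_ms, work_per_record_ms, batch_size):
    """Approximate latency attributed to each record in one batched call."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return round_trip_ms / batch_size + work_per_record_ms


def chunked(records, batch_size):
    """Yield consecutive batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Illustrative figures only: 20 ms round trip, 0.05 ms of work per record.
single = per_record_latency_ms(20.0, 0.05, 1)      # about 20.05 ms per record
batched = per_record_latency_ms(20.0, 0.05, 1000)  # about 0.07 ms per record

# 2,500 records split into batches of 1,000 give sizes 1000, 1000, 500.
batch_sizes = [len(batch) for batch in chunked(list(range(2500)), 1000)]
```

In this model the round trip dominates at batch size 1 and nearly disappears at batch size 1,000, which is the sense in which batching hides it.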

When the Edge Is Not the Right Place

If your data originates far from any edge — deep inside a backend that never touches the edge network — the round trip out and back may not be worth it for a single small record. In that case, batch aggressively or compress at your storage boundary instead. Measure the end-to-end latency on your path.
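
One way to measure is a small timing harness like the sketch below, which wraps whatever call you want to evaluate and reports median and tail latency. The `measure_latency_ms` helper and the placeholder workload are hypothetical. In practice you would pass in the function that performs your real compress call from the place in your stack where it would run.

```python
import statistics
import time


def measure_latency_ms(operation, runs=200):
    """Call operation repeatedly and summarize wall-clock latency in ms."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank style index for the 99th percentile, clamped to the list.
    p99_index = min(len(samples) - 1, int(len(samples) * 0.99))
    return {
        "p50": statistics.median(samples),
        "p99": samples[p99_index],
        "max": samples[-1],
    }


# Placeholder workload; replace with the real end-to-end compress call.
stats = measure_latency_ms(lambda: sum(range(1000)))
print(stats)
```

Comparing p50 against p99 matters here: a path that looks fine at the median can still carry a long tail when the request has to travel far to reach the edge.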

The Net Effect

Edge placement plus sub-millisecond compression means smoltext adds compression to your pipeline without becoming a performance tax. The savings on storage and egress come essentially for free on the latency budget — which is exactly what you want from a byte-reduction layer.

Written by SPRAPP Engineering
