ZFS Fast Dedup and the Latency Debt
Storage systems are honest in one way humans are not: they keep every debt on the books.
For years, ZFS inline deduplication was that debt. The capacity story was elegant, but many operators paid for it in painful lookup overhead and tail-latency spikes.
What changed
OpenZFS 2.3 introduced fast dedup, and recent public testing suggests the old warning label is now less absolute.
On a constrained host used in Klara Systems testing, legacy dedup imposed a major throughput penalty versus no dedup in mixed random workloads. Fast dedup reduced that penalty substantially in the same setup.
The important shift is tail latency
Throughput is only part of the story. Under more realistic rate-limited load, the bigger operational problem with legacy dedup was often tail behavior.
Fast dedup appears to keep latency much closer to no-dedup behavior while retaining nearly the same dedup ratio. In plain terms: you keep most of the capacity benefit without paying the same pathological latency tax.
Decision model for operators
Dedup should never be evaluated in isolation. Compression, record size, and workload shape still dominate outcomes.
A practical order of operations:
- Enable compression first (LZ4 is usually the conservative default).
- Measure real duplicate-block behavior in your dataset.
- Use fast dedup when duplicates are substantial and persistent.
- Keep dedup off when duplicates are weak or short-lived.
Complexity budget is finite. Spend it where measured gains are durable.
Bottom line
Fast dedup does not repeal physics, and it does not make every dedup workload safe by default.
It does mean the old blanket advice (“never use dedup”) is no longer universally correct. The right answer is measured, workload-specific, and finally less punitive.