cmd/dedup is the offline maintenance tool for duplicate detection. It reuses the
server's config and runs two phases (both by default; pass -hashes or -pairs to
select one):
- hashes: compute the perceptual hash of every live image/video missing one
(images from their bytes, videos from a middle frame via
DiskStorage.VideoFrameMiddle). Per-file failures are reported and counted, not
fatal.
- pairs: rebuild data.duplicate_pairs from all current hashes
(DuplicateService.Rescan).
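
For illustration, the phase selection and the non-fatal per-file handling could
look roughly like the sketch below. It is not the code in cmd/dedup: the file
type, phases, hashMissing and errNoFrame are hypothetical names, the in-memory
slice stands in for the database, and treating -hashes / -pairs as "run only
this phase" is an assumption drawn from the description above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// errNoFrame is a hypothetical error used to simulate a per-file failure.
var errNoFrame = errors.New("no middle frame")

// file is a hypothetical stand-in for a live image/video record.
type file struct {
	Name  string
	PHash *uint64 // nil plays the role of a NULL phash
}

// phases maps the two flags to the phases to run.
// Assumption: no flag means both phases, a flag selects its phase.
func phases(hashesFlag, pairsFlag bool) (runHashes, runPairs bool) {
	if !hashesFlag && !pairsFlag {
		return true, true
	}
	return hashesFlag, pairsFlag
}

// hashMissing hashes only files without a phash. A failing file is
// reported and counted, and the loop moves on to the next file.
func hashMissing(files []file, hash func(file) (uint64, error)) (done, failed int) {
	for i := range files {
		if files[i].PHash != nil {
			continue // already hashed, never touched again
		}
		h, err := hash(files[i])
		if err != nil {
			fmt.Printf("hash %s: %v\n", files[i].Name, err)
			failed++
			continue
		}
		files[i].PHash = &h
		done++
	}
	return done, failed
}

func main() {
	hashesFlag := flag.Bool("hashes", false, "run the hashes phase")
	pairsFlag := flag.Bool("pairs", false, "run the pairs phase")
	flag.Parse()

	runHashes, runPairs := phases(*hashesFlag, *pairsFlag)
	if runHashes {
		files := []file{{Name: "a.jpg"}, {Name: "b.mp4"}}
		done, failed := hashMissing(files, func(f file) (uint64, error) {
			if f.Name == "b.mp4" {
				return 0, errNoFrame
			}
			return 42, nil
		})
		fmt.Printf("hashed=%d failed=%d\n", done, failed)
	}
	if runPairs {
		fmt.Println("pairs phase would run here")
	}
}
```
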
Idempotent and safe to re-run: hashing only touches files whose phash is NULL,
and the pairs rebuild is a full replace. This is how video phashes and any
hashing backlog get computed, and how newly uploaded duplicates become visible.
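
To show why a full replace makes the pairs phase idempotent, here is a minimal
sketch that recomputes the whole pair set from the current hashes on every
call. It does not reproduce DuplicateService.Rescan: the pair type,
rebuildPairs, the 64-bit hash width, the Hamming-distance comparison and the
maxDist threshold are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// pair is a hypothetical row of the duplicate-pairs table, with A < B.
type pair struct{ A, B int64 }

// rebuildPairs derives the complete pair set from the given hashes.
// Nothing is carried over from a previous run, so calling it again with
// the same input yields the same output.
// Assumption: two files are duplicates when their 64-bit hashes differ
// in at most maxDist bits.
func rebuildPairs(hashes map[int64]uint64, maxDist int) []pair {
	ids := make([]int64, 0, len(hashes))
	for id := range hashes {
		ids = append(ids, id)
	}
	// Sort ids so the result order does not depend on map iteration.
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })

	var out []pair
	for i := 0; i < len(ids); i++ {
		for j := i + 1; j < len(ids); j++ {
			dist := bits.OnesCount64(hashes[ids[i]] ^ hashes[ids[j]])
			if dist <= maxDist {
				out = append(out, pair{ids[i], ids[j]})
			}
		}
	}
	return out
}

func main() {
	hashes := map[int64]uint64{1: 0xFF00, 2: 0xFF01, 3: 0x00FF}
	fmt.Println(rebuildPairs(hashes, 4))

	// A newly hashed upload shows up in the pair set after the next rebuild.
	hashes[4] = 0xFF03
	fmt.Println(rebuildPairs(hashes, 4))
}
```

The quadratic comparison is only meant for readability; a real rescan over a
large library would likely need an index or bucketing scheme.
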
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>