Files
tanabata/backend
H1K0 6e3e6a4194 feat(backend): dedup CLI (hash backfill + pairs rescan)
cmd/dedup is the offline maintenance tool for duplicate detection. It reuses the
server's config and runs two phases (both by default; -hashes / -pairs to pick):

- hashes: compute the perceptual hash of every live image/video missing one —
  images from their bytes, videos from a middle frame via DiskStorage.
  VideoFrameMiddle. Per-file failures are reported and counted, not fatal.
- pairs: rebuild data.duplicate_pairs from all current hashes (DuplicateService.
  Rescan).

Idempotent and safe to re-run: hashing only touches NULL phashes, the pairs
rebuild is a full replace. This is how video phashes and any backlog get
computed, and how newly uploaded duplicates become visible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:46:40 +03:00
..