feat(backend): perceptual hashing for images and video
Adds a 64-bit dHash perceptual hash (internal/imagehash, built on the existing disintegration/imaging, so no new dependency) and starts populating the long-unused data.files.phash column:

- Upload sets phash inline for images (cheap, from the in-memory bytes).
- Replace recomputes it from new content for images and clears it for anything else, so a stale hash never survives a content swap.
- FileRepo.SetPHash sets/clears the hash (used by Replace and, later, the dedup backfill).
- DiskStorage.VideoFrameMiddle extracts a frame from the middle of a clip (ffprobe duration -> ffmpeg -ss duration/2), avoiding the shared-intro collision a fixed early offset causes. It is a concrete method, not part of the storage port: only the dedup CLI needs it, keeping ffmpeg off the upload path. Video phashes are therefore computed by that CLI, not at upload time.
- DUPLICATE_HASH_THRESHOLD config (default 10/64) for the later pair rescan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -63,6 +63,12 @@ type Config struct {
 	// Import
 	ImportPath string
 
+	// DuplicateHashThreshold is the maximum Hamming distance (out of 64) between
+	// two perceptual hashes for the files to be treated as duplicate candidates.
+	// Lower = stricter (fewer, more confident matches); higher = looser. Used only
+	// by the dedup rescan that (re)builds data.duplicate_pairs.
+	DuplicateHashThreshold int
+
 	// Static SPA. When set, the server serves the built frontend (and falls
 	// back to index.html for client routes) on the same port as the API. Empty
 	// in local development, where the Vite dev server serves the UI separately.
@@ -176,6 +182,8 @@ func Load() (*Config, error) {
 
 		ImportPath: requireStr("IMPORT_PATH"),
 
+		DuplicateHashThreshold: parseInt("DUPLICATE_HASH_THRESHOLD", 10),
+
 		StaticDir: defaultStr("STATIC_DIR", ""),
 	}
 
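To illustrate how a threshold like DuplicateHashThreshold might be applied by the later rescan, here is a minimal sketch. The rescan itself is not part of this commit, so everything below is an assumption: thresholdFromEnv, hammingDistance and isDuplicateCandidate are hypothetical names, and the fallback-to-default behaviour only approximates what the repository's parseInt helper may do.

```go
package main

import (
	"fmt"
	"math/bits"
	"os"
	"strconv"
)

// defaultThreshold mirrors the default of 10 (out of 64) from the diff.
const defaultThreshold = 10

// thresholdFromEnv reads DUPLICATE_HASH_THRESHOLD and falls back to the
// default when the variable is unset, malformed, or outside 0..64.
// Assumption: the range check is added here for illustration only.
func thresholdFromEnv() int {
	v, err := strconv.Atoi(os.Getenv("DUPLICATE_HASH_THRESHOLD"))
	if err != nil || v < 0 || v > 64 {
		return defaultThreshold
	}
	return v
}

// hammingDistance counts the bits in which two 64-bit hashes differ.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// isDuplicateCandidate treats the threshold as a maximum distance, so a pair
// exactly at the threshold still counts as a candidate.
func isDuplicateCandidate(a, b uint64, threshold int) bool {
	return hammingDistance(a, b) <= threshold
}

func main() {
	t := thresholdFromEnv()
	a, b := uint64(0xF0F0F0F0F0F0F0F0), uint64(0xF0F0F0F0F0F0F0FF)
	fmt.Println(hammingDistance(a, b), isDuplicateCandidate(a, b, t))
}
```

Comparing every pair of hashes this way is quadratic in the number of files; whether the rescan does that or uses an index is not stated in the commit.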