Adds the duplicate-detection backend on top of perceptual hashing:
- Two tables (edited into the original migrations): data.duplicate_pairs holds
precomputed near-duplicate candidates (rebuilt wholesale by the rescan), and
data.duplicate_dismissals is a global "not a duplicate" overlay that survives
rescans. New audit actions file_merge / duplicate_dismiss.
- DuplicateService:
- Rescan builds every pair within DUPLICATE_HASH_THRESHOLD via a BK-tree over
the perceptual hashes and replaces the pairs table. This is the only thing
that populates pairs, so GET never compares all-vs-all (scales to 110k+).
- Clusters reads the precomputed pairs (ACL-filtered, non-trashed, non-
dismissed), groups them into connected components via union-find, and
paginates whole clusters.
- Resolve merges a pair field-by-field: each scalar from keep or discard,
metadata keep/discard/shallow-merge, tags/pools keep or union; then trashes
the discarded file. Enforces edit ACL on both.
- Dismiss records a canonical pair (view ACL on both).
- Endpoints under /files: GET /files/duplicates, POST /files/duplicates/dismiss,
POST /files/duplicates/resolve (registered before /:id to avoid collision).
Plain delete reuses /files/bulk/delete.
- Repo support: ListMissingPHash, ListAllPHashes, CopyPoolMemberships, plus the
DuplicatePairRepo (ReplaceAll via COPY, ListVisible) and DismissalRepo.
Unit tests cover the BK-tree pairing, union-find clustering, metadata merge and
field validation; an integration test covers rescan -> list -> merge -> dismiss
(including that a dismissal survives a re-rescan).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a 64-bit dHash perceptual hash (internal/imagehash, built on the existing
disintegration/imaging — no new dependency) and starts populating the long-unused
data.files.phash column:
- Upload sets phash inline for images (cheap, from the in-memory bytes).
- Replace recomputes it from new content for images and clears it for anything
else, so a stale hash never survives a content swap.
- FileRepo.SetPHash sets/clears the hash (used by Replace and, later, the dedup
backfill).
- DiskStorage.VideoFrameMiddle extracts a frame from the middle of a clip
(ffprobe duration -> ffmpeg -ss duration/2), avoiding the shared-intro collision
a fixed early offset causes. It is a concrete method, not part of the storage
port: only the dedup CLI needs it, keeping ffmpeg off the upload path. Video
phashes are therefore computed by that CLI, not at upload time.
- DUPLICATE_HASH_THRESHOLD config (default 10/64) for the later pair rescan.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tag pooled connections with application_name="tanabata-backend" via the
parsed pgxpool config, so the backend's sessions are identifiable in
pg_stat_activity and server logs. An application_name supplied in the
DSN (or PGAPPNAME) still takes precedence.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaces the old "untagged" sentinel tag with a proper per-file workflow
status: needs_review starts true on upload/import and is cleared by an
explicit action (no auto-clear on tagging). Surfaced as a filter token
(r=1 needs review, r=0 done) so it combines with tag/MIME conditions, and
toggled via POST /files/bulk/review (single id or many, edit-ACL enforced,
audit-logged as file_review).
needs_review lives on data.files (column added to the original 003 migration,
partial index in 006, action type seeded in 007).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CreateRule accepted apply_to_existing but ignored it, so enabling the
checkbox while creating a rule never retroactively tagged files already
carrying the when-tag — only activating an existing rule did. Extract the
retroactive expansion into TagRuleRepo.ApplyToExisting (reused by SetActive)
and call it from CreateRule when the rule is active, inside one transaction
so a file is never left half-tagged. Mirrors SetRuleActive semantics,
including following only active downstream rules.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PATCH "clear colour" path sent an empty string, which violates the
hex CHECK constraint and never falls back to the category colour. Map ''
to NULL via NULLIF in the tag insert/update so a cleared or omitted
colour is stored as NULL.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add POST /pools/{id}/views, mirroring the file-view endpoint: it
enforces view ACL and appends a row to activity.pool_views (viewed_at
defaults to statement_timestamp(), so each view is its own history row).
The table existed but nothing wrote to it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The category_name tag sort now breaks ties within a category by the
tag's own name (same direction), so tags group by category and read
alphabetically inside each group; uncategorized tags stay last.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Listing files with a tag filter now logs each referenced tag to
activity.tag_uses, flagging it included (positive) or excluded (negated
under an odd number of NOTs); the untagged pseudo-token is skipped. The
filter AST is reused to determine polarity, so grouped negations like
!(A|B) mark both tags excluded.
Recording happens only when a filter is first applied — not on cursor
pagination or an anchored return — so one browse counts once. The write
is best-effort and never fails the listing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The activity.file_views table existed but nothing ever wrote to it. Add a
POST /files/{id}/views endpoint: FileRepo.RecordView inserts a history row,
FileService.RecordView enforces view ACL first. The file viewer fires it
(fire-and-forget) when a file is opened, including while paging prev/next.
Documented in openapi.yaml; covered by TestRecordFileView (204 on view,
repeatable, 404 for unknown file).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Run gofmt -w across the backend, normalising the manually-aligned := blocks
to the gofmt standard. No code behaviour changes.
Add Prettier (+ prettier-plugin-svelte) to the frontend with the SvelteKit
default config (tabs, single quotes) so formatting is reproducible, then run
it over the whole tree. Add format / format:check npm scripts and a
.prettierignore (build output, generated schema.ts, static assets).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Listings returned every row regardless of ownership: GET /files, /tags,
/pools and /categories exposed other users' private items (while the
single-item GET correctly returned 403), and the pool file operations
(GET /pools/:id, /pools/:id/files, add/remove/reorder) skipped ACL
entirely, so any authenticated user could read and rewrite anyone's
private pool.
- List queries now filter to rows the caller may see (public, owned, or
granted can_view) via a shared SQL condition; admins bypass. The viewer
identity is taken from the request context by the service and passed to
the repository in the list params.
- Tag/Category/Pool single-item Get now enforce CanView (File already did).
- Pool Get/ListFiles require pool view; AddFiles/RemoveFiles/Reorder
require pool edit.
Adds regression tests for private-by-default listing (hidden / public /
granted / admin) and for pool operations rejecting a non-owner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reorder did DELETE all pool memberships then re-inserted only the passed
file_ids, so a paginated client that sent just the loaded pages silently
removed every other file from the pool. Reorder now places the requested
files in order and appends any members the request omitted (in their
current order), so a partial reorder is safe and correct.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The auth middleware trusted any unexpired, well-signed access token, so
logout, session termination and admin blocks had no effect until the
15-minute token expired. The middleware now validates that the token's
session is still active on every request (SessionRepo.GetByID), and
blocking a user deactivates all of their sessions, immediately revoking
their outstanding access tokens.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend PATCH /tags/{id}/rules/{then_id} to accept apply_to_existing bool.
When a rule is activated with apply_to_existing=true, a single recursive
CTE retroactively inserts the full transitive expansion of then_tag into
data.file_tag for all files already carrying when_tag:
WITH RECURSIVE expansion(tag_id) AS (
SELECT then_tag_id
UNION
SELECT r.then_tag_id FROM data.tag_rules r
JOIN expansion e ON r.when_tag_id = e.tag_id
WHERE r.is_active = true
)
INSERT INTO data.file_tag ... ON CONFLICT DO NOTHING
Changes:
- port/repository.go: add applyToExisting param to TagRuleRepo.SetActive
- db/postgres/tag_repo.go: implement recursive CTE retroactive apply
- service/tag_service.go: thread applyToExisting through SetRuleActive
- handler/tag_handler.go: parse apply_to_existing from PATCH body
- openapi.yaml: document apply_to_existing on PATCH endpoint
- integration test: add TestTagRuleActivateApplyToExisting covering
no-op when false, direct+transitive apply when true
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Make data.files.exif column nullable (was NOT NULL but service passes nil
for files without EXIF data, causing a constraint violation on upload)
- FileRepo.Create: include id in INSERT so disk storage path and DB record
share the same UUID (previously DB generated its own UUID, causing a mismatch)
- Integration test: use correct filter DSL format {t=<uuid>} instead of tag:<uuid>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add UserService (GetMe, UpdateMe, admin CRUD with block/unblock),
UserHandler (/users, /users/me), ACLHandler (GET/PUT /acl/:type/:id),
AuditHandler (GET /audit with all filters). Fix UserRepo.Update to
include is_blocked. Wire all remaining routes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add pool repo (gap-based position ordering, cursor pagination, add/remove/reorder
files), service, handler, and wire all /pools endpoints including
/pools/:id/files, /pools/:id/files/remove, and /pools/:id/files/reorder.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add category repo, service, handler, and wire all /categories endpoints
including list, create, get, update, delete, and list-tags.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
filter_parser.go — recursive-descent parser for the {token,...} DSL.
Tokens: t=UUID (tag), m=INT (MIME exact), m~PATTERN (MIME LIKE),
operators & | ! ( ) with standard NOT>AND>OR precedence.
All values go through pgx parameters ($N) — SQL injection impossible.
file_repo.go — full FileRepo:
- Create/GetByID/Update via CTE RETURNING with JOIN for one round-trip
- SoftDelete/Restore/DeletePermanent with RowsAffected guards
- SetTags: full replace (DELETE + INSERT per tag)
- ListTags: delegates to loadTagsBatch (single query for N files)
- List: keyset cursor pagination (bidirectional), anchor mode,
filter DSL, search ILIKE, trash flag, 4 sort columns.
Cursor is base64url(JSON) encoding sort position; backward
pagination fetches in reversed ORDER BY then reverses the slice.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AuditRepo.Log resolves action_type_id/object_type_id via SQL subqueries.
AuditRepo.List supports dynamic filtering by user, action, object type/ID,
and date range with COUNT(*) OVER() for total count.
AuditService.Log reads user from context, marshals details to JSON,
and delegates to the repo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add postgres ACLRepo (List/Get/Set) and ACLService with CanView/CanEdit
checks (admin bypass, public flag, creator shortcut, explicit grants)
and GetPermissions/SetPermissions for the /acl endpoints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
db/db.go: TxFromContext/ContextWithTx for transaction propagation,
Querier interface (QueryRow/Query/Exec), ScanRow generic helper,
ClampLimit/ClampOffset pagination guards.
db/postgres/postgres.go: NewPool with ping validation, Transactor
backed by pgxpool (BeginTx → fn → commit/rollback), connOrTx helper
that returns the active transaction from context or falls back to pool.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>