Adds the duplicate-detection backend on top of perceptual hashing:
- Two tables (edited into the original migrations): data.duplicate_pairs holds
  precomputed near-duplicate candidates (rebuilt wholesale by the rescan), and
  data.duplicate_dismissals is a global "not a duplicate" overlay that survives
  rescans. New audit actions file_merge / duplicate_dismiss.
- DuplicateService:
  - Rescan builds every pair within DUPLICATE_HASH_THRESHOLD via a BK-tree over
    the perceptual hashes and replaces the pairs table. This is the only thing
    that populates pairs, so GET never compares all-vs-all (scales to 110k+).
  - Clusters reads the precomputed pairs (ACL-filtered, non-trashed,
    non-dismissed), groups them into connected components via union-find, and
    paginates whole clusters.
  - Resolve merges a pair field-by-field: each scalar from keep or discard,
    metadata keep/discard/shallow-merge, tags/pools keep or union; then trashes
    the discarded file. Enforces edit ACL on both.
  - Dismiss records a canonical pair (view ACL on both).
- Endpoints under /files: GET /files/duplicates, POST /files/duplicates/dismiss,
  POST /files/duplicates/resolve (registered before /:id to avoid collision).
  Plain delete reuses /files/bulk/delete.
- Repo support: ListMissingPHash, ListAllPHashes, CopyPoolMemberships, plus the
  DuplicatePairRepo (ReplaceAll via COPY, ListVisible) and DismissalRepo.
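For illustration only, the BK-tree pairing that Rescan performs could look roughly like the sketch below. It is not the committed code: the names hashedFile, bkNode and nearPairs, the 64-bit hash width and the Hamming-distance metric are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hashedFile is a hypothetical (id, perceptual hash) row; the real schema may differ.
type hashedFile struct {
	ID    int
	PHash uint64
}

// bkNode is one BK-tree node; children are keyed by Hamming distance to this node.
type bkNode struct {
	file     hashedFile
	children map[int]*bkNode
}

func hamming(a, b uint64) int { return bits.OnesCount64(a ^ b) }

// insert follows the edge labelled with the distance to each node until a slot is free.
func (n *bkNode) insert(f hashedFile) {
	for {
		d := hamming(n.file.PHash, f.PHash)
		child, ok := n.children[d]
		if !ok {
			n.children[d] = &bkNode{file: f, children: map[int]*bkNode{}}
			return
		}
		n = child
	}
}

// query appends every stored file within threshold bits of h. The triangle
// inequality allows skipping edges outside [d-threshold, d+threshold].
func (n *bkNode) query(h uint64, threshold int, out *[]hashedFile) {
	d := hamming(n.file.PHash, h)
	if d <= threshold {
		*out = append(*out, n.file)
	}
	for edge, child := range n.children {
		if edge >= d-threshold && edge <= d+threshold {
			child.query(h, threshold, out)
		}
	}
}

// nearPairs returns each pair (lo, hi), lo < hi, whose hashes differ by at most
// threshold bits. Querying before inserting emits every pair exactly once.
func nearPairs(files []hashedFile, threshold int) [][2]int {
	var pairs [][2]int
	var root *bkNode
	for _, f := range files {
		if root == nil {
			root = &bkNode{file: f, children: map[int]*bkNode{}}
			continue
		}
		var hits []hashedFile
		root.query(f.PHash, threshold, &hits)
		for _, h := range hits {
			lo, hi := h.ID, f.ID
			if lo > hi {
				lo, hi = hi, lo
			}
			pairs = append(pairs, [2]int{lo, hi})
		}
		root.insert(f)
	}
	return pairs
}

func main() {
	files := []hashedFile{
		{ID: 1, PHash: 0x0}, {ID: 2, PHash: 0x1},
		{ID: 3, PHash: 0xFF00}, {ID: 4, PHash: 0xFF01},
	}
	fmt.Println(nearPairs(files, 2)) // prints [[1 2] [3 4]]
}
```

Storing the smaller ID first gives each pair one canonical form, which is one plausible way to keep a pairs or dismissals table free of mirrored rows.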
Unit tests cover the BK-tree pairing, union-find clustering, metadata merge and
field validation; an integration test covers rescan -> list -> merge -> dismiss
(including that a dismissal survives a re-rescan).
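As a rough illustration of the union-find clustering step, the sketch below groups ID pairs into connected components and pages over whole clusters. The names unionFind, clusters and page are hypothetical, and the ACL, trash and dismissal filtering done by the real service is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// unionFind is a disjoint-set forest over arbitrary int file IDs.
type unionFind struct{ parent map[int]int }

func newUnionFind() *unionFind { return &unionFind{parent: map[int]int{}} }

// find returns the representative of x, registering x on first sight and
// halving the path as it walks up.
func (u *unionFind) find(x int) int {
	if _, ok := u.parent[x]; !ok {
		u.parent[x] = x
	}
	for u.parent[x] != x {
		u.parent[x] = u.parent[u.parent[x]]
		x = u.parent[x]
	}
	return x
}

// union merges the sets containing a and b.
func (u *unionFind) union(a, b int) {
	ra, rb := u.find(a), u.find(b)
	if ra != rb {
		u.parent[ra] = rb
	}
}

// clusters groups pairs into connected components. Members are sorted, and
// clusters are ordered by their smallest member so paging is stable.
func clusters(pairs [][2]int) [][]int {
	u := newUnionFind()
	for _, p := range pairs {
		u.union(p[0], p[1])
	}
	byRoot := map[int][]int{}
	for id := range u.parent {
		r := u.find(id)
		byRoot[r] = append(byRoot[r], id)
	}
	out := make([][]int, 0, len(byRoot))
	for _, members := range byRoot {
		sort.Ints(members)
		out = append(out, members)
	}
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out
}

// page returns up to limit whole clusters starting at a non-negative offset;
// a cluster is never split across pages.
func page(all [][]int, offset, limit int) [][]int {
	if offset >= len(all) {
		return nil
	}
	end := offset + limit
	if end > len(all) {
		end = len(all)
	}
	return all[offset:end]
}

func main() {
	c := clusters([][2]int{{1, 2}, {2, 3}, {7, 9}})
	fmt.Println(c)             // prints [[1 2 3] [7 9]]
	fmt.Println(page(c, 1, 5)) // prints [[7 9]]
}
```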
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tanabata File Manager
A multi-user, tag-based web file manager for images and video. Go + Gin backend (Clean Architecture, pgx, goose migrations), SvelteKit SPA frontend, PostgreSQL, JWT auth — shipped as a single Docker image that serves both the API and the built SPA on one port.
Documentation
- openapi.yaml: full REST API specification
- docs/DEPLOY.md: production deploy (Gitea Actions → host)
- docs/GO_PROJECT_STRUCTURE.md: backend architecture
- docs/FRONTEND_STRUCTURE.md: frontend architecture
- .env.example: every configuration variable, documented
Quick start
cp .env.example .env # then edit the secrets (JWT_SECRET, ADMIN_PASSWORD, …)
docker compose up -d --build
By default this runs the app plus a bundled PostgreSQL container
(COMPOSE_PROFILES=with-db). To use a Postgres instance already running on the host,
leave COMPOSE_PROFILES empty (COMPOSE_PROFILES=) and point DATABASE_URL at host.docker.internal. See
.env.example for the full matrix.
The app is published on 127.0.0.1 only and expects a reverse proxy in front (see below). The default port is 42776 — the sum of the Unicode code points of 七夕.
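The port arithmetic can be checked with a few lines of Go. This is only a sanity check; codePointSum is a helper name made up for the example.

```go
package main

import "fmt"

// codePointSum adds up the Unicode code points (runes) of s.
func codePointSum(s string) int {
	sum := 0
	for _, r := range s {
		sum += int(r)
	}
	return sum
}

func main() {
	// 七 is U+4E03 (19971) and 夕 is U+5915 (22805).
	fmt.Println(codePointSum("七夕")) // prints 42776
}
```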
Reverse proxy (nginx)
The container publishes its port on loopback (127.0.0.1:${APP_PORT}:42776 in
docker-compose.yml), so a reverse proxy on the host
terminates TLS and forwards to it. Three settings matter for this app:
- client_max_body_size: uploads go up to MAX_UPLOAD_BYTES (500 MiB by default). nginx caps request bodies at 1 MiB out of the box, so without this any upload larger than 1 MiB fails with 413.
- Forwarded headers: the app trusts X-Forwarded-For only from the hops in TRUSTED_PROXIES (default: loopback + Docker bridge ranges) and keys its login/refresh rate limiter on the resulting client IP. If the proxy doesn't send the header, every request looks like it comes from the proxy and shares one rate-limit bucket.
- Streaming for big media: turning request/response buffering off lets large uploads stream straight to the app and lets video range-seeks work without nginx spooling whole files to disk first.
server {
    listen 443 ssl;
    server_name tanabata.example.com;
    # ssl_certificate / ssl_certificate_key ... (e.g. from certbot)

    # Match MAX_UPLOAD_BYTES (500 MiB default); nginx defaults to 1m → 413.
    client_max_body_size 512m;

    location / {
        proxy_pass http://127.0.0.1:42776; # APP_PORT
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Stream large uploads/downloads instead of buffering to disk; keeps
        # video range-seek responsive. Scope these to file/preview locations
        # instead if you'd rather keep buffering for small JSON responses.
        proxy_request_buffering off;
        proxy_buffering off;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}
If you run the app without a proxy and want it reachable on the LAN, drop the
127.0.0.1: prefix from the ports line in
docker-compose.yml and adjust TRUSTED_PROXIES
accordingly.
Development
# Backend
cd backend
go run ./cmd/server # dev server
go test ./... # all tests
# Frontend
cd frontend
npm run dev # Vite dev server
npm run build # production build
npm run generate:types # regenerate API types from openapi.yaml