H1K0 9216a8687f feat(backend): duplicate pairs, dismissals, and merge resolution
Adds the duplicate-detection backend on top of perceptual hashing:

- Two tables (edited into the original migrations): data.duplicate_pairs holds
  precomputed near-duplicate candidates (rebuilt wholesale by the rescan), and
  data.duplicate_dismissals is a global "not a duplicate" overlay that survives
  rescans. New audit actions file_merge / duplicate_dismiss.
- DuplicateService:
  - Rescan builds every pair within DUPLICATE_HASH_THRESHOLD via a BK-tree over
    the perceptual hashes and replaces the pairs table. This is the only thing
    that populates pairs, so GET never compares all-vs-all (scales to 110k+).
  - Clusters reads the precomputed pairs (ACL-filtered, non-trashed, non-
    dismissed), groups them into connected components via union-find, and
    paginates whole clusters.
  - Resolve merges a pair field-by-field: each scalar from keep or discard,
    metadata keep/discard/shallow-merge, tags/pools keep or union; then trashes
    the discarded file. Enforces edit ACL on both.
  - Dismiss records a canonical pair (view ACL on both).
- Endpoints under /files: GET /files/duplicates, POST /files/duplicates/dismiss,
  POST /files/duplicates/resolve (registered before /:id to avoid collision).
  Plain delete reuses /files/bulk/delete.
- Repo support: ListMissingPHash, ListAllPHashes, CopyPoolMemberships, plus the
  DuplicatePairRepo (ReplaceAll via COPY, ListVisible) and DismissalRepo.
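For readers unfamiliar with the rescan approach, here is a minimal sketch of BK-tree pairing. It is not the project's code: it assumes 64-bit perceptual hashes compared by Hamming distance, uses slice indices as stand-in file IDs, and the names `bkTree`, `findPairs`, and `pair` are hypothetical. The threshold argument plays the role of DUPLICATE_HASH_THRESHOLD, whose actual value is not stated here.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// hamming returns the number of differing bits between two 64-bit hashes.
func hamming(a, b uint64) int { return bits.OnesCount64(a ^ b) }

// node is one BK-tree vertex; children are keyed by their distance to it.
type node struct {
	id       int
	hash     uint64
	children map[int]*node
}

type bkTree struct{ root *node }

// insert follows the edge labelled with the distance to each visited node
// until it finds a free slot.
func (t *bkTree) insert(id int, hash uint64) {
	n := &node{id: id, hash: hash, children: map[int]*node{}}
	if t.root == nil {
		t.root = n
		return
	}
	cur := t.root
	for {
		d := hamming(hash, cur.hash)
		next, ok := cur.children[d]
		if !ok {
			cur.children[d] = n
			return
		}
		cur = next
	}
}

// query returns the ids of all stored hashes within threshold of hash.
func (t *bkTree) query(hash uint64, threshold int) []int {
	var out []int
	if t.root == nil {
		return out
	}
	stack := []*node{t.root}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		d := hamming(hash, cur.hash)
		if d <= threshold {
			out = append(out, cur.id)
		}
		// Triangle inequality: only subtrees on edges in
		// [d-threshold, d+threshold] can contain a match.
		for edge, child := range cur.children {
			if edge >= d-threshold && edge <= d+threshold {
				stack = append(stack, child)
			}
		}
	}
	return out
}

// pair is stored canonically with A < B.
type pair struct{ A, B int }

// findPairs queries the tree before inserting each hash, so every pair is
// emitted exactly once and the earlier index always lands in A.
func findPairs(hashes []uint64, threshold int) []pair {
	var tree bkTree
	var pairs []pair
	for i, h := range hashes {
		for _, j := range tree.query(h, threshold) {
			pairs = append(pairs, pair{A: j, B: i})
		}
		tree.insert(i, h)
	}
	sort.Slice(pairs, func(x, y int) bool {
		if pairs[x].A != pairs[y].A {
			return pairs[x].A < pairs[y].A
		}
		return pairs[x].B < pairs[y].B
	})
	return pairs
}

func main() {
	fmt.Println(findPairs([]uint64{0x0, 0x1, 0xFF, 0x3}, 1)) // [{0 1} {1 3}]
}
```

Querying before inserting avoids both self-matches and emitting each pair twice, which also gives the canonical ordering for free.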

Unit tests cover the BK-tree pairing, union-find clustering, metadata merge and
field validation; an integration test covers rescan -> list -> merge -> dismiss
(including that a dismissal survives a re-rescan).
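The clustering step can be illustrated with a small union-find sketch. This is an assumption-laden illustration rather than the service's implementation: file IDs are plain ints, pairs are assumed to be already filtered (ACL, trash, dismissals), and `unionFind` and `clusters` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// unionFind tracks disjoint sets of file ids.
type unionFind struct{ parent map[int]int }

func newUnionFind() *unionFind { return &unionFind{parent: map[int]int{}} }

// find returns the representative of x, shortening paths as it goes.
func (u *unionFind) find(x int) int {
	if _, ok := u.parent[x]; !ok {
		u.parent[x] = x
	}
	for u.parent[x] != x {
		u.parent[x] = u.parent[u.parent[x]] // path halving
		x = u.parent[x]
	}
	return x
}

// union merges the sets containing a and b; the smaller root wins, so the
// representative of a set is always its lowest id.
func (u *unionFind) union(a, b int) {
	ra, rb := u.find(a), u.find(b)
	if ra == rb {
		return
	}
	if ra < rb {
		u.parent[rb] = ra
	} else {
		u.parent[ra] = rb
	}
}

// clusters groups pairs into connected components, each sorted by id, and
// orders the components by their lowest id so pagination is stable.
func clusters(pairs [][2]int) [][]int {
	uf := newUnionFind()
	for _, p := range pairs {
		uf.union(p[0], p[1])
	}
	groups := map[int][]int{}
	for id := range uf.parent {
		root := uf.find(id)
		groups[root] = append(groups[root], id)
	}
	out := make([][]int, 0, len(groups))
	for _, members := range groups {
		sort.Ints(members)
		out = append(out, members)
	}
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out
}

func main() {
	fmt.Println(clusters([][2]int{{1, 2}, {2, 3}, {7, 8}})) // [[1 2 3] [7 8]]
}
```

With a deterministic cluster order, paginating whole clusters reduces to slicing the returned list by offset and limit.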

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:42:37 +03:00

Tanabata File Manager

A multi-user, tag-based web file manager for images and video. Go + Gin backend (Clean Architecture, pgx, goose migrations), SvelteKit SPA frontend, PostgreSQL, JWT auth — shipped as a single Docker image that serves both the API and the built SPA on one port.

Documentation

Quick start

cp .env.example .env        # then edit the secrets (JWT_SECRET, ADMIN_PASSWORD, …)
docker compose up -d --build

By default this runs the app plus a bundled PostgreSQL container (COMPOSE_PROFILES=with-db). To use a Postgres instance already running on the host, set COMPOSE_PROFILES to an empty value (COMPOSE_PROFILES=) and point DATABASE_URL at host.docker.internal. See .env.example for the full matrix.

The app is published on 127.0.0.1 only and expects a reverse proxy in front (see below). The default port is 42776 — the sum of the Unicode code points of 七夕.
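The port arithmetic is easy to check. A small sketch (the helper name `sumCodePoints` is made up for illustration):

```go
package main

import "fmt"

// sumCodePoints adds up the Unicode code points of every rune in s.
func sumCodePoints(s string) int {
	total := 0
	for _, r := range s {
		total += int(r)
	}
	return total
}

func main() {
	// 七 is U+4E03 (19971) and 夕 is U+5915 (22805).
	fmt.Println(sumCodePoints("七夕")) // 42776
}
```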

Reverse proxy (nginx)

The container publishes its port on loopback (127.0.0.1:${APP_PORT}:42776 in docker-compose.yml), so a reverse proxy on the host terminates TLS and forwards to it. Three settings matter for this app:

  1. client_max_body_size — uploads go up to MAX_UPLOAD_BYTES (500 MiB by default). nginx caps request bodies at 1 MiB out of the box, so without this every large upload fails with 413.
  2. Forwarded headers — the app trusts X-Forwarded-For only from the hops in TRUSTED_PROXIES (default: loopback + Docker bridge ranges) and keys its login/refresh rate limiter on the resulting client IP. If the proxy doesn't send the header, every request looks like it comes from the proxy and shares one rate-limit bucket.
  3. Streaming for big media — turning request/response buffering off lets large uploads stream straight to the app and lets video range-seeks work without nginx spooling whole files to disk first.

server {
    listen 443 ssl;
    server_name tanabata.example.com;

    # ssl_certificate / ssl_certificate_key ... (e.g. from certbot)

    # Match MAX_UPLOAD_BYTES (500 MiB default); nginx defaults to 1m → 413.
    client_max_body_size 512m;

    location / {
        proxy_pass http://127.0.0.1:42776;   # APP_PORT
        proxy_http_version 1.1;

        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Stream large uploads/downloads instead of buffering to disk; keeps
        # video range-seek responsive. Scope these to file/preview locations
        # instead if you'd rather keep buffering for small JSON responses.
        proxy_request_buffering off;
        proxy_buffering         off;
        proxy_read_timeout      300s;
        proxy_send_timeout      300s;
    }
}
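To make the forwarded-headers point concrete, here is a rough sketch of how a client IP is typically resolved behind trusted proxies. It is not the app's actual code. It assumes the common approach of walking X-Forwarded-For from right to left and skipping trusted hops, and the names `clientIP`, `isTrusted`, and `mustPrefixes`, along with the example ranges, are hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// mustPrefixes parses CIDR strings and panics on malformed input.
func mustPrefixes(cidrs ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		out = append(out, netip.MustParsePrefix(c))
	}
	return out
}

// isTrusted reports whether addr falls inside any trusted range.
func isTrusted(addr netip.Addr, trusted []netip.Prefix) bool {
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	for _, p := range trusted {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// clientIP resolves the effective client address. peer is the IP of the
// direct TCP peer and xff is the raw X-Forwarded-For header value.
func clientIP(peer, xff string, trusted []netip.Prefix) string {
	peerAddr, err := netip.ParseAddr(peer)
	if err != nil || !isTrusted(peerAddr, trusted) {
		// The header is only honoured when it arrives via a trusted hop.
		return peer
	}
	hops := strings.Split(xff, ",")
	// Walk right to left: the rightmost entries were appended by the
	// proxies closest to the app.
	for i := len(hops) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			break // missing or malformed header: fall back to the peer
		}
		if !isTrusted(addr, trusted) {
			return addr.String()
		}
	}
	return peer
}

func main() {
	trusted := mustPrefixes("127.0.0.0/8", "172.16.0.0/12")
	fmt.Println(clientIP("127.0.0.1", "203.0.113.7", trusted)) // 203.0.113.7
	fmt.Println(clientIP("127.0.0.1", "", trusted))            // 127.0.0.1
}
```

The second call shows the failure mode described above: with no header from the proxy, every request resolves to the proxy's own address and lands in the same rate-limit bucket.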

If you run the app without a proxy and want it reachable on the LAN, drop the 127.0.0.1: prefix from the ports line in docker-compose.yml and adjust TRUSTED_PROXIES accordingly.

Development

# Backend
cd backend
go run ./cmd/server          # dev server
go test ./...                # all tests

# Frontend
cd frontend
npm run dev                  # Vite dev server
npm run build                # production build
npm run generate:types       # regenerate API types from openapi.yaml