One-time migration from the old Python/Flask Tanabata DB into the new core/data/acl/activity schema.

- transform.sql: reads a `legacy` schema and writes the new one in a single, idempotent transaction. Remaps user/mime ids (uuid -> smallint by name), inverts is_private -> is_public, lifts EXIF out of files.metadata into the exif column, preserves pool hierarchy/created under metadata, synthesises file_pool ordering, derives acl object types, sanitises colors/notes.
- migrate.sh: links the new DB to the old one via postgres_fdw, imports the old public schema as `legacy`, runs the transform, tears the link down.
- README.md: mapping table, decisions/lossy points, and the separate physical-blob copy step.
- docs/reference/schema.sql: the old DB schema the migration is built from (referenced by the README).

Verified end-to-end on PostgreSQL 16 (synthetic legacy data, all transformations and idempotency checked).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Legacy data migration
Moves data from the old Tanabata database (the Python/Flask version, schema
in docs/reference/schema.sql) into the
new core / data / acl / activity schema.
- transform.sql: the actual data transformation. Reads a legacy schema (the old tables) and writes the new schema, in one transaction. Idempotent.
- migrate.sh: links the new DB to the old one via postgres_fdw, imports the old public schema as legacy, runs transform.sql, then removes the link. The old DB is only read.
Tested end-to-end against PostgreSQL 16 (schema applied, synthetic legacy data, all transformations + idempotency verified).
Prerequisites
- The new schema exists and is seeded: start the app once (it runs the goose migrations incl. 007_seed_data), or run goose manually.
- NEW_DSN connects as a role allowed to CREATE EXTENSION postgres_fdw (a superuser; the compose Postgres' POSTGRES_USER is one).
- The new Postgres server can reach the old DB host over the network.
- psql on PATH.
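The checklist above can be turned into a quick preflight. The sketch below is not part of migrate.sh: the helper names missing_vars and preflight are made up for illustration, and treating every OLD_* variable as mandatory is an assumption (the script may supply defaults for some). It does not test whether the old host is reachable from the new server.

```shell
# Hypothetical preflight helpers; not shipped with migrate.sh.

# Print the name of each listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup that also works in plain sh
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Return non-zero (with a message on stderr) if a prerequisite is not met.
preflight() {
  # Assumption: all of these are required; migrate.sh may default some of them.
  missing=$(missing_vars NEW_DSN OLD_HOST OLD_PORT OLD_DB OLD_USER OLD_PASSWORD)
  if [ -n "$missing" ]; then
    echo "unset variables:" $missing >&2
    return 1
  fi
  command -v psql >/dev/null 2>&1 || { echo "psql not found on PATH" >&2; return 1; }
  # CREATE EXTENSION postgres_fdw needs a superuser; pg_roles.rolsuper tells us.
  super=$(psql "$NEW_DSN" -Atc "SELECT rolsuper FROM pg_roles WHERE rolname = current_user")
  [ "$super" = "t" ] || { echo "NEW_DSN role is not a superuser" >&2; return 1; }
}
```

Calling preflight before ./migrate.sh fails early with a readable message instead of partway through the fdw setup.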
Run
cd scripts/migrate-legacy
NEW_DSN='postgres://tanabata:PASS@localhost:42777/tanabata' \
OLD_HOST=192.168.1.10 OLD_PORT=5432 OLD_DB=tfm \
OLD_USER=hiko OLD_PASSWORD=SECRET \
./migrate.sh
It prints the source (legacy) row counts, then the resulting new-schema counts.
Re-running is safe — ON CONFLICT DO NOTHING everywhere means a second run only
fills in what is missing.
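To check the re-run claim against your own data, one option is to snapshot the row counts of the target tables before and after a second run and compare them. The helper below is a sketch and not part of the repository: new_counts is a made-up name, and the table list is copied from the mapping table in this README. Equal counts only show that no rows were added; they say nothing about whether existing rows changed.

```shell
# Hypothetical helper: print "<table> <row count>" for every target table.
# Table names are those listed in this README's mapping table.
new_counts() {
  for t in core.users core.mime_types data.categories data.tags data.tag_rules \
           data.files data.file_tag data.pools data.file_pool \
           acl.permissions activity.file_views; do
    printf '%s %s\n' "$t" "$(psql "$NEW_DSN" -Atc "SELECT count(*) FROM $t")"
  done
}

# Usage (after a first successful ./migrate.sh):
#   before=$(new_counts)
#   ./migrate.sh
#   after=$(new_counts)
#   [ "$before" = "$after" ] && echo "second run added no rows"
```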
Without postgres_fdw
transform.sql only needs the old tables to be visible as a legacy schema. If
you'd rather not use fdw, load the old dump into a schema named legacy in the
new database by whatever means, then run just the transform:
psql "$NEW_DSN" -v ON_ERROR_STOP=1 -f transform.sql
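One possible way to get the old tables into a legacy schema is to restore the dump into a scratch database, rename its public schema, and re-dump only that schema into the new database. This is a sketch under several assumptions and has not been run against a real Tanabata dump: load_legacy and the scratch database name are made up, the dump is assumed to be plain SQL, and the libpq environment (PGHOST, PGUSER and so on) is assumed to point at a server where a scratch database may be created. Objects that depend on extensions installed in the old public schema may need manual handling.

```shell
# Hypothetical helper; untested against a real dump.
# $1: plain-SQL dump of the old database, $2: scratch database name (optional).
# The schema is renamed in the scratch DB so that the re-dump emits legacy.*
# object names instead of public.* ones.
load_legacy() {
  dump=$1
  scratch=${2:-legacy_scratch}
  createdb "$scratch" &&
  psql -d "$scratch" -v ON_ERROR_STOP=1 -f "$dump" &&
  psql -d "$scratch" -v ON_ERROR_STOP=1 -c 'ALTER SCHEMA public RENAME TO legacy' &&
  pg_dump -d "$scratch" -n legacy --no-owner --no-privileges |
    psql "$NEW_DSN" -v ON_ERROR_STOP=1 &&
  dropdb "$scratch"
}
```

After load_legacy old-dump.sql succeeds, the transform command above can be run as is.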
What gets migrated, and how
| Old (public) | New | Notes |
|---|---|---|
| users | core.users | id uuid → smallint (remapped by unique name); can_edit → can_create; is_blocked = false |
| mime | core.mime_types | id uuid → smallint (remapped by name); types not already seeded are added |
| categories | data.categories | id kept; is_private → is_public (inverted) |
| tags | data.tags | id + category_id kept; inverted privacy |
| autotags | data.tag_rules | parent_id → when_tag_id, child_id → then_tag_id |
| files | data.files | id kept; datetime → content_datetime; orig_name → original_name; EXIF lifted from metadata->'exif' into the exif column, the rest stays as user metadata |
| file_tag | data.file_tag | orphan rows skipped |
| pools | data.pools | id kept; parent_id + created preserved under metadata (see below) |
| file_pool | data.file_pool | position synthesised (gapped 1000s, ordered by file id) |
| acl | acl.permissions | object type derived by locating the object; read/write → can_view/can_edit |
| file_views | activity.file_views | datetime → viewed_at |
Throughout: empty notes ('') → NULL; colours that aren't 6-hex are set to
NULL (the old CHECK was NOT VALID, so bad values could exist).
Decisions / lossy points
- Passwords are copied verbatim. If the old hashes are bcrypt (as the new app expects) logins keep working; otherwise affected users need a reset.
- created timestamps on categories/tags/files are dropped: their UUIDv7 ids already encode creation time. Pools use random v4 ids, so their created (and the dropped pool hierarchy parent_id) are preserved under data.pools.metadata as legacy_created / legacy_parent_id.
- file_pool ordering: the old schema stored none, so position is generated from file-id order (≈ chronological) with gaps of 1000.
- Not migrated: sessions / user_agents (the new app uses JWTs, so users simply log in again). There were no audit-log / pool-view / tag-use tables in the old schema, so those start empty. phash and is_deleted are new (NULL / false).
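Two of the decisions above rest on how UUIDv7 ids behave: the first 48 bits hold a Unix timestamp in milliseconds, so sorting the ids as text approximates creation order. The shell sketch below only illustrates that reasoning; the real logic lives in transform.sql. The function names are made up, synth_positions handles the files of a single pool, and starting the numbering at 1000 is an assumption.

```shell
# Illustrative helpers only; the actual migration logic is in transform.sql.

# Print the Unix-millisecond timestamp stored in the first 48 bits of a UUIDv7.
uuid7_ms() {
  hex=$(printf '%s' "$1" | tr -d '-' | cut -c1-12)   # first 12 hex digits
  printf '%d\n' "0x$hex"
}

# Read one pool's file ids on stdin (one per line), sort them bytewise, and
# print "<id> <position>" with gaps of 1000.
# Assumption: numbering starts at 1000; transform.sql may start elsewhere.
synth_positions() {
  LC_ALL=C sort | awk '{ print $0, NR * 1000 }'
}
```

For example, uuid7_ms 017f22e2-79b0-7cc3-98c4-dc0c0c07398f prints 1645557742000, which is 22 February 2022. The gaps of 1000 leave room to insert a file between two neighbours later without renumbering the whole pool.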
Physical files (separate, manual)
The script migrates the database only. File blobs must be copied too. The
new layout stores originals at FILES_PATH/{uuid} with no extension;
thumbnails/previews are regenerated on demand, so don't copy those. Because ids
are preserved, the old {uuid}.{ext} files map 1:1 — just strip the extension:
OLD_FILES=/srv/old-tanabata/files # old originals ({uuid}.{ext})
NEW_FILES=/var/lib/tanabata/files # new FILES_PATH
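for src in "$OLD_FILES"/*; do
[ -f "$src" ] || continue # skip sub-directories and the unmatched glob of an empty directory
id="$(basename "$src")"; id="${id%.*}" # uuids contain no dots
cp -n "$src" "$NEW_FILES/$id" # -n: never overwrite an existing blob
done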
# Make them readable by the container user (uid/gid 42776):
chown -R 42776:42776 "$NEW_FILES"
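After the copy it can be worth confirming that every old original has a byte-identical counterpart under the new name. The function below is a sketch with a made-up name (verify_blobs); it checks only the old-to-new direction and does not compare against the rows in data.files.

```shell
# Hypothetical check: every {uuid}.{ext} in the old directory must have an
# identical {uuid} file in the new one. Returns non-zero if any is missing
# or differs, listing the affected ids on stderr.
verify_blobs() {
  old_dir=$1
  new_dir=$2
  bad=0
  for src in "$old_dir"/*; do
    [ -f "$src" ] || continue
    id=$(basename "$src"); id=${id%.*}
    if ! cmp -s "$src" "$new_dir/$id"; then
      echo "missing or different: $id" >&2
      bad=$((bad + 1))
    fi
  done
  [ "$bad" -eq 0 ]
}

# Usage:
#   verify_blobs "$OLD_FILES" "$NEW_FILES" && echo "all blobs copied"
```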