scripts/train.py: fix max_seq_len 256→320 (must match pretrained checkpoint);
increase epochs 15→50 and patience 5→10 to give the small corpus enough
gradient steps; reduce warmup 20→10 steps (20 was 22% of total steps).
scripts/generate.py: default to prepending the tonic chord when --prefix is
not given; add --no-tonic-anchor to opt out.
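
A minimal sketch of the new default, assuming an argparse-based CLI. Only
--prefix and --no-tonic-anchor come from this change; the --tonic flag, the
helper names, and the chord-token spelling are hypothetical and only serve
the illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate a chord sequence (sketch).")
    parser.add_argument("--prefix", nargs="*", default=None,
                        help="chord tokens to seed generation with")
    parser.add_argument("--no-tonic-anchor", action="store_true",
                        help="do not prepend the tonic chord when --prefix is omitted")
    # Hypothetical flag: the real script may determine the tonic differently.
    parser.add_argument("--tonic", default="C", help="tonic chord token (assumed)")
    return parser


def build_prompt(args: argparse.Namespace) -> list[str]:
    """Return the token list that generation starts from."""
    if args.prefix is not None:
        # An explicit prefix always wins; no anchor is added.
        return list(args.prefix)
    if args.no_tonic_anchor:
        # Opt-out: start from an empty prompt.
        return []
    # Default: anchor generation on the tonic chord.
    return [args.tonic]
```

With no arguments this yields a one-token prompt holding the tonic; passing
--no-tonic-anchor yields an empty prompt, and any --prefix is used unchanged.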
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pretrain.py -> checkpoints/pretrained.report.txt
train.py -> checkpoints/finetuned.report.txt
A single line, "[report] saved -> <path>", is printed to stdout instead.
Also replace the arrow character, which the Windows cp1251 console cannot
encode.
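
A rough sketch of the behaviour described above, not the scripts' actual
code. It assumes the report path is derived from the checkpoint path; the
helper names report_path_for, save_report, and encodable are hypothetical.

```python
from pathlib import Path


def report_path_for(checkpoint: Path) -> Path:
    """Map checkpoints/pretrained.pt to checkpoints/pretrained.report.txt."""
    return checkpoint.with_suffix(".report.txt")


def save_report(checkpoint: Path, report_text: str) -> str:
    """Write the report next to the checkpoint and print a one-line notice."""
    path = report_path_for(checkpoint)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(report_text, encoding="utf-8")
    # ASCII "->" is used because U+2192 has no mapping in cp1251.
    message = f"[report] saved -> {path.as_posix()}"
    print(message)
    return message


def encodable(text: str, encoding: str = "cp1251") -> bool:
    """Check whether a console using `encoding` could print `text`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False
```

encodable() returns False for a string containing U+2192 and True for the
ASCII-only notice, which is the failure mode the arrow fix avoids.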
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scripts/run_pretrain.py -> scripts/pretrain.py: pre-trains on McGill
corpus (data/processed/mcgill/), saves checkpoints/pretrained.pt.
- scripts/train.py: rewritten as high-level fine-tune wrapper; loads
pretrained.pt, trains on data/processed/user/, saves finetuned.pt.
Both scripts include timing estimate, loss-curve plot, per-epoch report,
and --skip-training flag.
- README: updated section 7 to reflect new script names and separate
data directories.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AdamW + cosine-with-warmup schedule, PAD-ignoring cross-entropy, per-epoch
CSV logging, best-val-loss checkpointing, early stopping (patience=5).
Same script handles both pre-training and fine-tuning via --init-from.
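
For orientation, a dependency-free sketch of two of the pieces listed above:
a cosine-with-warmup learning-rate multiplier and patience-based early
stopping with best-val-loss tracking. The names lr_scale and EarlyStopping
are hypothetical, and linear warmup followed by decay to zero is an
assumption; the script itself may differ in detail.

```python
import math


def lr_scale(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier on the base LR: linear warmup, then cosine decay to 0."""
    if warmup_steps > 0 and step < warmup_steps:
        return (step + 1) / warmup_steps
    # Fraction of the post-warmup schedule completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


class EarlyStopping:
    """Track the best validation loss and stop after `patience` bad epochs."""

    def __init__(self, patience: int = 5) -> None:
        self.patience = patience
        self.best = math.inf
        self.bad_epochs = 0

    def update(self, val_loss: float) -> tuple[bool, bool]:
        """Return (is_best, should_stop) for this epoch's validation loss."""
        if val_loss < self.best:
            # New best: the caller would save a checkpoint here.
            self.best = val_loss
            self.bad_epochs = 0
            return True, False
        self.bad_epochs += 1
        return False, self.bad_epochs >= self.patience
```

In a PyTorch training loop a multiplier of this shape can be handed to
torch.optim.lr_scheduler.LambdaLR, and PAD positions can be excluded from
the loss with the ignore_index argument of torch.nn.CrossEntropyLoss.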
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>