Commit Graph

2 Commits

Author SHA1 Message Date
H1K0 03b464973a feat: write training report to file instead of stdout
pretrain.py -> checkpoints/pretrained.report.txt
train.py    -> checkpoints/finetuned.report.txt

A single-line "[report] saved -> <path>" message is printed to stdout instead.
Also fix arrow character incompatible with Windows cp1251 console.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 12:40:44 +03:00
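
A minimal sketch of the behaviour described in the commit above, not the repository's actual code. The helper name save_report and the rule of deriving the report path from the checkpoint path are assumptions made for illustration. The commit only states the resulting paths and that one "[report] saved -> <path>" line goes to stdout. The ASCII "->" is used because a Unicode arrow such as U+2192 has no mapping in cp1251, so printing it to a cp1251 console raises UnicodeEncodeError.

```python
from pathlib import Path


def save_report(report_text: str, checkpoint_path: str) -> Path:
    """Write the training report next to the checkpoint file.

    Hypothetical helper: maps e.g. checkpoints/pretrained.pt to
    checkpoints/pretrained.report.txt (naming rule assumed from the
    paths listed in the commit message).
    """
    report_path = Path(checkpoint_path).with_suffix(".report.txt")
    report_path.parent.mkdir(parents=True, exist_ok=True)
    # Write as UTF-8 explicitly so the file does not depend on the console code page.
    report_path.write_text(report_text, encoding="utf-8")
    # ASCII-only status line: safe to print on a Windows cp1251 console.
    print(f"[report] saved -> {report_path}")
    return report_path
```

Called as save_report(text, "checkpoints/pretrained.pt"), this keeps the full report out of stdout and leaves only the single status line there.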
H1K0 632407ebef refactor: split training scripts into pretrain.py and train.py
- scripts/run_pretrain.py -> scripts/pretrain.py: pre-trains on McGill
  corpus (data/processed/mcgill/), saves checkpoints/pretrained.pt.
- scripts/train.py: rewritten as high-level fine-tune wrapper; loads
  pretrained.pt, trains on data/processed/user/, saves finetuned.pt.
- Both scripts include a timing estimate, loss-curve plot, per-epoch report,
  and --skip-training flag.
- README: updated section 7 to reflect new script names and separate
  data directories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 12:35:23 +03:00
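
A rough sketch of how a --skip-training flag could be wired up with argparse. The commit does not say what the flag skips, so this assumes it bypasses only the training loop while the timing estimate and report still run. The function names and step labels are placeholders, not the scripts' real structure.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative training wrapper")
    # store_true makes the flag default to False; passing it sets True.
    parser.add_argument(
        "--skip-training",
        action="store_true",
        help="skip the training loop (assumed meaning)",
    )
    return parser


def run(argv: Optional[List[str]] = None) -> List[str]:
    """Return the names of the steps that would be executed (placeholders)."""
    args = build_parser().parse_args(argv)
    steps = ["timing-estimate"]
    if not args.skip_training:  # argparse maps --skip-training to skip_training
        steps.append("train")
    steps.append("report")
    return steps
```

With these assumptions, run(["--skip-training"]) yields the estimate and report steps only, and run([]) includes the training step as well.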