Moved data/processed/{train,val,holdout}/ → data/processed/mcgill/{train,val,holdout}/
so both corpora have their own namespace under data/processed/.
Updated PRETRAIN_DATA paths in make_colab_zip.py accordingly;
the path remap workaround is no longer needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
make_colab_zip.py now accepts --mode pretrain|finetune (default: pretrain).
Finetune mode bundles scripts/train.py + data/processed/user/{train,val}/*.pt
plus an optional --include-checkpoint flag for pretrained.pt.
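
A minimal sketch of the CLI surface described above, using argparse.
The function name build_parser and the help strings are illustrative,
and treating --include-checkpoint as a boolean switch is an assumption;
the actual script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this commit."""
    parser = argparse.ArgumentParser(prog="make_colab_zip.py")
    # --mode selects which bundle to build; pretrain stays the default.
    parser.add_argument(
        "--mode",
        choices=["pretrain", "finetune"],
        default="pretrain",
        help="which Colab bundle to build",
    )
    # Assumed to be a simple on/off switch for bundling pretrained.pt.
    parser.add_argument(
        "--include-checkpoint",
        action="store_true",
        help="also bundle pretrained.pt (finetune mode)",
    )
    return parser
```

With this shape, running the script with no arguments keeps the
existing pretrain behaviour, and passing --mode finetune opts in to
the new bundle.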
notebooks/colab_finetune.ipynb covers the full Colab fine-tuning workflow:
upload zip → upload pretrained.pt → verify data → train → inspect → download.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scripts/make_colab_zip.py packages src/, scripts/pretrain.py,
requirements.txt, and processed .pt files into hamori_colab.zip,
remapping data/processed/{train,val}/ → data/processed/mcgill/{train,val}/
so pretrain.py finds the data without modification.
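
The remap described above could be sketched roughly as follows, by
rewriting each file's archive name as it is written to the zip. The
helper names remap_arcname and add_pt_files are hypothetical and not
taken from the script; only the directory layout comes from this
commit.

```python
import zipfile
from pathlib import Path, PurePosixPath


def remap_arcname(rel_path: str) -> str:
    """Map data/processed/{train,val}/... to data/processed/mcgill/{train,val}/...

    Any other path is returned unchanged.
    """
    parts = PurePosixPath(rel_path).parts
    is_split = (
        len(parts) >= 3
        and parts[:2] == ("data", "processed")
        and parts[2] in ("train", "val")
    )
    if is_split:
        return "/".join(("data", "processed", "mcgill") + parts[2:])
    return "/".join(parts)


def add_pt_files(zf: zipfile.ZipFile, repo_root: Path) -> None:
    """Write processed .pt files into the archive under the remapped names."""
    processed = repo_root / "data" / "processed"
    for path in sorted(processed.rglob("*.pt")):
        rel = path.relative_to(repo_root).as_posix()
        # arcname controls the path stored inside the zip, not the path on disk.
        zf.write(path, arcname=remap_arcname(rel))
```

Remapping at archive time leaves the files on disk untouched, so only
the extracted layout on Colab changes.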
notebooks/colab_pretrain.ipynb walks through upload, extraction,
dependency install, the training run, report display, and results download.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>