Commit Graph

3 Commits

Author SHA1 Message Date
H1K0 c56397df54 docs: add Russian project report notebook (notebooks/report.ipynb)
Runnable end-to-end report combining narrative, code, and inline figures:
the data and the .chord format, how the transformer works, two-stage
training curves, perplexity (3.58 -> 2.15), a distribution-shift plot with
a reading legend, qualitative examples, and a generation demo. Written in
a first-person student voice.
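
For readers comparing the two perplexity figures, here is a minimal sketch of how such numbers are usually related to the training loss. It assumes the common definition (perplexity is the exponential of the mean per-token negative log-likelihood, in nats); the commit does not say how the notebook computes it, and all function names below are hypothetical.

```python
import math

# Assumption: perplexity = exp(mean per-token negative log-likelihood in nats).
# The notebook's actual computation is not shown in the commit message.

def perplexity(token_nlls):
    """Perplexity from a list of per-token negative log-likelihoods."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def mean_loss_from_perplexity(ppl):
    """Inverse mapping: the mean cross-entropy that yields a given perplexity."""
    return math.log(ppl)

def relative_reduction(before, after):
    """Fractional drop in perplexity between two stages."""
    return (before - after) / before

# The two values quoted in the commit message.
before, after = 3.58, 2.15
print(f"mean loss before: {mean_loss_from_perplexity(before):.3f} nats")
print(f"mean loss after:  {mean_loss_from_perplexity(after):.3f} nats")
print(f"relative perplexity reduction: {relative_reduction(before, after):.1%}")
```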

- CLAUDE.md: report is now a Jupyter notebook; GOST formatting dropped
- requirements.txt: add nbconvert + ipykernel (optional, for the notebook)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 17:07:00 +03:00
H1K0 8f657ca916 scripts: add --mode finetune to make_colab_zip, add colab_finetune notebook
make_colab_zip.py now accepts --mode pretrain|finetune (default: pretrain).
Finetune mode bundles scripts/train.py and data/processed/user/{train,val}/*.pt;
the optional --include-checkpoint flag also adds pretrained.pt.
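
The command-line surface described above could be sketched roughly as follows. This is an illustration, not the actual make_colab_zip.py: the helper names `build_parser` and `bundle_patterns` are hypothetical, the location of pretrained.pt is assumed to be the repository root, and whether finetune mode also bundles src/ or requirements.txt is not stated in the commit, so they are left out of that branch.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser with the two options named in the commit message."""
    parser = argparse.ArgumentParser(description="Bundle project files for Colab (sketch)")
    parser.add_argument("--mode", choices=["pretrain", "finetune"], default="pretrain",
                        help="which training stage to bundle for")
    parser.add_argument("--include-checkpoint", action="store_true",
                        help="finetune only: also bundle pretrained.pt")
    return parser

def bundle_patterns(mode: str, include_checkpoint: bool = False) -> list[str]:
    """Return glob patterns to bundle for a mode (hypothetical helper)."""
    if mode == "pretrain":
        # Contents listed in the earlier commit that introduced the script.
        return ["src/**/*", "scripts/pretrain.py", "requirements.txt",
                "data/processed/train/*.pt", "data/processed/val/*.pt"]
    patterns = ["scripts/train.py",
                "data/processed/user/train/*.pt",
                "data/processed/user/val/*.pt"]
    if include_checkpoint:
        # Assumption: the checkpoint sits at the repository root.
        patterns.append("pretrained.pt")
    return patterns

# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--mode", "finetune", "--include-checkpoint"])
print(bundle_patterns(args.mode, args.include_checkpoint))
```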

notebooks/colab_finetune.ipynb covers the full Colab fine-tuning workflow:
upload zip → upload pretrained.pt → verify data → train → inspect → download.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 19:47:10 +03:00
H1K0 89770dd009 feat: add Colab bundle script and pre-training notebook
scripts/make_colab_zip.py packages src/, scripts/pretrain.py,
requirements.txt, and processed .pt files into hamori_colab.zip,
remapping data/processed/{train,val}/ -> data/processed/mcgill/{train,val}/
so pretrain.py finds the data without modification.
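
The path remapping described above can be done with the `arcname` argument of `zipfile.ZipFile.write`. The sketch below is one way to do it under stated assumptions, not the script's actual code: `remap_arcname` and `write_bundle` are hypothetical names, and only the data/processed/{train,val}/ prefix is rewritten.

```python
import zipfile
from pathlib import Path, PurePosixPath

def remap_arcname(rel_path: str) -> str:
    """Map data/processed/{train,val}/... to data/processed/mcgill/{train,val}/...

    Any other path is returned unchanged (hypothetical helper).
    """
    parts = PurePosixPath(rel_path).parts
    if len(parts) >= 3 and parts[:2] == ("data", "processed") and parts[2] in ("train", "val"):
        return str(PurePosixPath("data", "processed", "mcgill", *parts[2:]))
    return str(PurePosixPath(*parts))

def write_bundle(root: Path, files: list[Path], out_zip: Path) -> None:
    """Write files into out_zip, storing each under its remapped archive name."""
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            rel = f.relative_to(root).as_posix()
            zf.write(f, arcname=remap_arcname(rel))

print(remap_arcname("data/processed/train/example.pt"))
```

Remapping at archive time leaves the local data layout untouched, and the extracted tree on Colab already matches the paths pretrain.py expects.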

notebooks/colab_pretrain.ipynb walks through upload, extraction,
dependency installation, the training run, report display, and results download.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:00:03 +03:00