3 Commits

Author SHA1 Message Date
H1K0 632407ebef refactor: split training scripts into pretrain.py and train.py
- scripts/run_pretrain.py -> scripts/pretrain.py: pre-trains on McGill
  corpus (data/processed/mcgill/), saves checkpoints/pretrained.pt.
- scripts/train.py: rewritten as high-level fine-tune wrapper; loads
  pretrained.pt, trains on data/processed/user/, saves finetuned.pt.
  Both scripts include a timing estimate, a loss-curve plot, a per-epoch
  report, and a --skip-training flag.
- README: updated section 7 to reflect new script names and separate
  data directories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 12:35:23 +03:00
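
As a rough illustration of the fine-tune wrapper described in commit 632407ebef, the sketch below shows how a script like scripts/train.py might wire up its command line. It is not the repository's code. Only the --skip-training flag, checkpoints/pretrained.pt, data/processed/user/ and the finetuned.pt file name come from the commit message. The --epochs option, the function names, and the timing formula are assumptions made for illustration, and the output directory for finetuned.pt is not stated in the commit, so the bare file name is used.

```python
import argparse
from pathlib import Path

# Paths named in the commit message.
PRETRAINED = Path("checkpoints/pretrained.pt")
USER_DATA = Path("data/processed/user")
# The commit gives only the file name; its directory is not stated.
FINETUNED = Path("finetuned.pt")


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser for a hypothetical fine-tune wrapper."""
    parser = argparse.ArgumentParser(description="Fine-tune wrapper (sketch)")
    # Flag named in the commit message.
    parser.add_argument(
        "--skip-training",
        action="store_true",
        help="run setup and reporting without the training loop",
    )
    # Hypothetical option, not mentioned in the commit message.
    parser.add_argument("--epochs", type=int, default=10)
    return parser


def estimate_seconds(num_batches: int, epochs: int, seconds_per_batch: float) -> float:
    """Naive timing estimate (assumed formula): batches x epochs x time per batch."""
    return num_batches * epochs * seconds_per_batch


def main(argv=None) -> dict:
    """Parse arguments and return the planned run instead of training a model."""
    args = build_parser().parse_args(argv)
    return {
        "load": PRETRAINED.as_posix(),
        "data": USER_DATA.as_posix(),
        "save": FINETUNED.as_posix(),
        "epochs": args.epochs,
        "train": not args.skip_training,
    }
```

Calling main(["--skip-training"]) returns a plan whose "train" entry is False, which is one plausible way a wrapper could exercise the timing estimate and reporting without running the training loop.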
H1K0 555205b7d2 docs: update vocab size (81→85), spec version (2.0→2.2), style tag (user→H1K0)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 03:23:55 +03:00
H1K0 75fa07bf6c docs: add README, architecture, glossary, requirements; update CLAUDE.md
Add four Russian-language project documents:
- README.md: user-facing guide (install, quick start, data prep, training,
  evaluation, limitations)
- docs/architecture.md v1.0: system architecture, data flow diagrams,
  module interfaces, 7 architectural decision records, extension points
- docs/glossary.md v1.0: musical, ML, and project-specific term definitions
- docs/requirements.md v1.0: functional/non-functional requirements,
  acceptance criteria, four use-case scenarios

Update CLAUDE.md with project name etymology (hamori / ハモリ) and rename
repo root reference from chord-gen to hamori. Refine chord_format_spec.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:00:21 +03:00