docs: add README, architecture, glossary, requirements; update CLAUDE.md

Add four Russian-language project documents:
- README.md: user-facing guide (install, quick start, data prep,
  training, evaluation, limitations)
- docs/architecture.md v1.0: system architecture, data flow diagrams,
  module interfaces, 7 architectural decision records, extension points
- docs/glossary.md v1.0: musical, ML, and project-specific term
  definitions
- docs/requirements.md v1.0: functional/non-functional requirements,
  acceptance criteria, four use-case scenarios

Update CLAUDE.md with project name etymology (hamori / ハモリ) and rename
repo root reference from chord-gen to hamori. Refine
chord_format_spec.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:00:21 +03:00
parent 9929209bcf
commit 75fa07bf6c
6 changed files with 2312 additions and 5 deletions
@@ -4,11 +4,20 @@ This file gives Claude Code persistent context for the project. Read it before a

 ## Project overview

-**Goal.** Train a small autoregressive transformer to generate harmonic periods (4–16 bar chord progressions) in the author's compositional style. Coursework deliverable for an ML class at RTU MIREA; also intended as a working creative tool.
+**Name.** _hamori_ (Japanese ハモリ, "harmonization" in the sense of vocal
+harmony — adding a second voice to a melodic line). The name reflects the
+project's core idea: the model proposes harmonic ideas to complement a
+composer's existing intent, rather than writing music from scratch.
+
+**Goal.** Train a small autoregressive transformer to generate harmonic
+periods (4–16 bar chord progressions) in the author's compositional style.
+Coursework deliverable for an ML class at RTU MIREA; also intended as a
+working creative tool.

 **Unit of generation.** A single closed harmonic phrase (a "period"), not a full song.

 **Pipeline.**
+
 1. Hand-transcribe own compositions from REAPER DAW projects into `.chord` text files.
 2. Parse `.chord` → factorized token sequences.
 3. Pre-train on a public corpus (McGill Billboard or similar).
@@ -34,7 +43,7 @@ Avoid heavy abstractions. This is coursework, not a production system. Prefer si
 ## Repository layout

 ```
-chord-gen/
+hamori/
 ├── CLAUDE.md                          ← this file
 ├── README.md
 ├── requirements.txt
@@ -88,6 +97,7 @@ The authoritative specification is in `docs/chord_format_spec.md`. **Always read
 ## Model

 A small autoregressive transformer:
+
 - Layers: 2–4
 - d_model: 128–256
 - Heads: 4–8
@@ -111,6 +121,7 @@ Pre-training uses the full public corpus. Fine-tuning uses the own corpus with a
 ## Evaluation

 For the report:
+
 1. **Perplexity** on the holdout set, comparing pre-trained baseline vs fine-tuned.
 2. **Distribution shift plots** — histograms over chord qualities, extension presence, inversion frequency, root motion intervals — showing how fine-tuning moves the distribution toward the author's corpus.
 3. **Qualitative cherry-picked generations** — 3 examples with the same seed/prefix, generated by baseline vs fine-tuned, rendered to MIDI.