docs: add README, architecture, glossary, requirements; update CLAUDE.md
Add four Russian-language project documents:
- README.md: user-facing guide (install, quick start, data prep, training, evaluation, limitations)
- docs/architecture.md v1.0: system architecture, data flow diagrams, module interfaces, 7 architectural decision records, extension points
- docs/glossary.md v1.0: musical, ML, and project-specific term definitions
- docs/requirements.md v1.0: functional/non-functional requirements, acceptance criteria, four use-case scenarios

Update CLAUDE.md with project name etymology (hamori / ハモリ) and rename repo root reference from chord-gen to hamori. Refine chord_format_spec.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -4,11 +4,20 @@ This file gives Claude Code persistent context for the project. Read it before a
## Project overview

-**Goal.** Train a small autoregressive transformer to generate harmonic periods (4–16 bar chord progressions) in the author's compositional style. Coursework deliverable for an ML class at RTU MIREA; also intended as a working creative tool.
+**Name.** _hamori_ (Japanese ハモリ, "harmonization" in the sense of vocal
+harmony — adding a second voice to a melodic line). The name reflects the
+project's core idea: the model proposes harmonic ideas to complement a
+composer's existing intent, rather than writing music from scratch.
+
+**Goal.** Train a small autoregressive transformer to generate harmonic
+periods (4–16 bar chord progressions) in the author's compositional style.
+Coursework deliverable for an ML class at RTU MIREA; also intended as a
+working creative tool.

**Unit of generation.** A single closed harmonic phrase (a "period"), not a full song.

**Pipeline.**

1. Hand-transcribe own compositions from REAPER DAW projects into `.chord` text files.
2. Parse `.chord` → factorized token sequences.
3. Pre-train on a public corpus (McGill Billboard or similar).
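
Step 2 of the pipeline can be pictured with a minimal sketch. The authoritative grammar lives in `docs/chord_format_spec.md` and is not reproduced here, so the chord syntax, the function names `factorize_chord` and `tokenize_period`, and the token prefixes `ROOT_`, `QUAL_`, `EXT_`, `BASS_` are all hypothetical. The sketch only illustrates the idea of factorization: each chord becomes several small tokens instead of one token per distinct chord symbol.

```python
import re

# HYPOTHETICAL chord syntax, for illustration only:
#   ROOT [QUALITY] [EXTENSION] [/BASS], e.g. "Cmaj7/E".
# The real grammar is defined in docs/chord_format_spec.md.
CHORD_RE = re.compile(
    r"^(?P<root>[A-G][#b]?)"
    r"(?P<quality>maj|min|dim|aug|sus4|sus2)?"
    r"(?P<ext>6|7|9|11|13)?"
    r"(?:/(?P<bass>[A-G][#b]?))?$"
)


def factorize_chord(symbol: str) -> list[str]:
    """Split one chord symbol into root, quality, extension and bass tokens."""
    m = CHORD_RE.match(symbol)
    if m is None:
        raise ValueError(f"unrecognized chord symbol: {symbol!r}")
    # Assumption: a missing quality is treated as major.
    tokens = [f"ROOT_{m['root']}", f"QUAL_{m['quality'] or 'maj'}"]
    if m["ext"]:
        tokens.append(f"EXT_{m['ext']}")
    if m["bass"]:
        tokens.append(f"BASS_{m['bass']}")
    return tokens


def tokenize_period(chords: str) -> list[str]:
    """Tokenize a whitespace-separated progression, wrapped in <bos>/<eos>."""
    tokens = ["<bos>"]
    for symbol in chords.split():
        tokens.extend(factorize_chord(symbol))
    tokens.append("<eos>")
    return tokens


print(tokenize_period("Dmin7 G7 Cmaj7/E"))
```

A factorized vocabulary of this kind stays small (a handful of roots, qualities, extensions and basses), which may matter when the fine-tuning corpus is tiny.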
@@ -34,7 +43,7 @@ Avoid heavy abstractions. This is coursework, not a production system. Prefer si
## Repository layout

```
-chord-gen/
+hamori/
├── CLAUDE.md ← this file
├── README.md
├── requirements.txt
@@ -88,6 +97,7 @@ The authoritative specification is in `docs/chord_format_spec.md`. **Always read
## Model

A small autoregressive transformer:

- Layers: 2–4
- d_model: 128–256
- Heads: 4–8
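
To get a feel for how small these ranges are, the sketch below validates a configuration against them and estimates its parameter count. Only the three ranges come from the text above. The class `ModelConfig`, the function `approx_param_count`, the vocabulary size of 512, the feed-forward width of 4 × d_model, and tied input/output embeddings are assumptions made for illustration; positional embeddings are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    n_layers: int = 2
    d_model: int = 128
    n_heads: int = 4
    vocab_size: int = 512  # ASSUMPTION: not stated in the source
    d_ff_mult: int = 4     # ASSUMPTION: feed-forward width = 4 * d_model

    def __post_init__(self) -> None:
        # Ranges taken from the hyperparameter list above.
        if not 2 <= self.n_layers <= 4:
            raise ValueError("n_layers must be in 2..4")
        if not 128 <= self.d_model <= 256:
            raise ValueError("d_model must be in 128..256")
        if not 4 <= self.n_heads <= 8:
            raise ValueError("n_heads must be in 4..8")
        # Multi-head attention splits d_model evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")


def approx_param_count(cfg: ModelConfig) -> int:
    """Rough count of weights and biases for a decoder-only transformer."""
    d = cfg.d_model
    d_ff = cfg.d_ff_mult * d
    attention = 4 * d * d + 4 * d            # Q, K, V, output projections
    feed_forward = 2 * d * d_ff + d_ff + d   # two linear layers
    layer_norms = 2 * 2 * d                  # two norms, scale and shift each
    per_layer = attention + feed_forward + layer_norms
    embedding = cfg.vocab_size * d           # tied with the output projection
    final_norm = 2 * d
    return cfg.n_layers * per_layer + embedding + final_norm


small = ModelConfig()
large = ModelConfig(n_layers=4, d_model=256, n_heads=8)
print(approx_param_count(small), approx_param_count(large))
```

Under these assumptions the range runs from roughly 0.46M to roughly 3.3M parameters, which is consistent with the "small" label.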
@@ -111,6 +121,7 @@ Pre-training uses the full public corpus. Fine-tuning uses the own corpus with a
## Evaluation

For the report:

1. **Perplexity** on the holdout set, comparing pre-trained baseline vs fine-tuned.
2. **Distribution shift plots** — histograms over chord qualities, extension presence, inversion frequency, root motion intervals — showing how fine-tuning moves the distribution toward the author's corpus.
3. **Qualitative cherry-picked generations** — 3 examples with the same seed/prefix, generated by baseline vs fine-tuned, rendered to MIDI.
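
The first two metrics above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's evaluation code: the function names `perplexity` and `root_motion_histogram` are hypothetical, log-probabilities are assumed to be natural logs of the probability the model assigned to each holdout token, and root motion is measured in ascending semitones modulo 12.

```python
import math
from collections import Counter


def perplexity(token_log_probs: list[float]) -> float:
    """exp of the mean negative log-likelihood over holdout tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


PITCH_CLASS = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}


def root_motion_histogram(roots: list[str]) -> Counter:
    """Count intervals (semitones up, mod 12) between consecutive chord roots."""
    return Counter(
        (PITCH_CLASS[b] - PITCH_CLASS[a]) % 12 for a, b in zip(roots, roots[1:])
    )


# A model that assigns probability 0.25 to every token has perplexity ~4.
print(perplexity([math.log(0.25)] * 3))
# D -> G -> C: both steps rise by 5 semitones (a perfect fourth).
print(root_motion_histogram(["D", "G", "C"]))
```

Computing the same histogram for baseline generations, fine-tuned generations, and the author's corpus gives the three distributions the shift plots would compare.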