Runnable end-to-end report combining narrative, code, and inline figures:
data and .chord format, transformer working principle, two-stage training
curves, perplexity (3.58 -> 2.15), distribution-shift plot with a reading
legend, qualitative examples, and a generation demo. Written in a
first-person student voice.
- CLAUDE.md: report is now a Jupyter notebook; GOST formatting dropped
- requirements.txt: add nbconvert + ipykernel (optional, for the notebook)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Single-page form wrapping src.generate.generate_period: pick model, mode,
key, style, function, time, sampling params and optional prefix; returns
the chord grid plus downloadable .chord and .mid files. Russian usage
instructions are embedded on the same page.
Auto-length output is capped at 16 bars (the period maximum) so a model
that never emits EOS can't run away into dozens of NC/hold bars.
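A minimal sketch of the cap, not the code in this commit. It assumes a
hypothetical next_bar callable standing in for the model's per-bar sampling
step and a hypothetical EOS sentinel; only the 16-bar limit comes from the
change itself.

```python
from typing import Callable, List

MAX_BARS = 16   # period maximum stated in this commit
EOS = "<eos>"   # hypothetical end-of-sequence marker


def generate_capped(next_bar: Callable[[List[str]], str],
                    max_bars: int = MAX_BARS) -> List[str]:
    """Collect bars until EOS or until the cap is hit, whichever comes first."""
    bars: List[str] = []
    while len(bars) < max_bars:
        bar = next_bar(bars)
        if bar == EOS:
            break
        bars.append(bar)
    return bars


def never_stops(bars: List[str]) -> str:
    """Stand-in for a model that keeps emitting NC bars and never emits EOS."""
    return "NC"


capped = generate_capped(never_stops)
```

With the stand-in above, the loop exits on the length check rather than on
EOS, so the output stays within one period.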
Added at the author's explicit request; the web UI was previously out of
scope, so CLAUDE.md and README are updated accordingly. Choices for style/
function/time are derived from VOCAB so the form can't drift from the
tokenizer.
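One way the derivation could look, as a sketch only. The VOCAB layout below
(token string to id, with STYLE_/FUNC_/TIME_ prefixes) and the choices_for
helper are assumptions for illustration; the real tokenizer may name and
group its tokens differently.

```python
from typing import Dict, List

# Hypothetical vocabulary layout: token string -> id, with a category prefix.
VOCAB: Dict[str, int] = {
    "<pad>": 0,
    "STYLE_pop": 1,
    "STYLE_jazz": 2,
    "FUNC_verse": 3,
    "FUNC_chorus": 4,
    "TIME_4/4": 5,
    "TIME_3/4": 6,
}


def choices_for(prefix: str, vocab: Dict[str, int] = VOCAB) -> List[str]:
    """Return sorted option labels for every token that starts with prefix."""
    return sorted(tok[len(prefix):] for tok in vocab if tok.startswith(prefix))


# The form reads its dropdown options from the vocabulary, never from
# a hand-maintained list.
style_choices = choices_for("STYLE_")
function_choices = choices_for("FUNC_")
time_choices = choices_for("TIME_")
```

Because the options are computed from the vocabulary, adding or removing a
token changes the form without a second edit.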
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- scripts/run_pretrain.py: single-command pre-training runner with
timing estimate, loss-curve plot (matplotlib), and per-epoch report.
Sets max_seq_len=256 (McGill sequences max out at 195 tokens; halving the
512 default makes attention ~4x cheaper, since its cost grows with the
square of the sequence length).
- src/train.py: normalise --output so pretrained.pt and pretrained both
produce pretrained.pt + pretrained.log.csv (not pretrained.pt.log.csv).
Serialize Path fields as strings in the checkpoint so it loads under
torch.load(weights_only=True).
- requirements.txt: drop unused pandas/music21, add mido (pretty_midi dep).
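The two src/train.py changes can be sketched as below. Both helper names
(normalise_output, stringify_paths) are hypothetical and the checkpoint
layout is assumed to be a flat dict; this illustrates the behaviour
described above rather than reproducing the actual code.

```python
from pathlib import Path
from typing import Any, Dict, Tuple


def normalise_output(output: str) -> Tuple[Path, Path]:
    """Map 'pretrained' or 'pretrained.pt' to (pretrained.pt, pretrained.log.csv)."""
    path = Path(output)
    # Drop a trailing .pt so both spellings share the same stem.
    stem = path.with_suffix("") if path.suffix == ".pt" else path
    return (stem.with_name(stem.name + ".pt"),
            stem.with_name(stem.name + ".log.csv"))


def stringify_paths(config: Dict[str, Any]) -> Dict[str, Any]:
    """Replace Path values with plain strings so only basic types are saved."""
    return {k: str(v) if isinstance(v, Path) else v for k, v in config.items()}


ckpt_path, log_path = normalise_output("pretrained.pt")
safe_config = stringify_paths({"data_dir": Path("data"), "max_seq_len": 256})
```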
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
src/tokenizer.py:
- parse_chord_file(Path) → ChordPeriod: reads header + bar body, strips //
comments, validates bar position counts and chord symbols, raises
ChordFormatError with filename and bar number on any violation.
- transpose_to_canonical(ChordPeriod) → ChordPeriod: shifts all chord roots
and bass notes by the semitone offset to C major / A minor; fast-path
returns the original object when shift == 0.
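A sketch of the semitone arithmetic behind the transposition, working on
bare note names only. The helper names and the sharp-only output spelling
are assumptions; the real function operates on ChordPeriod objects and may
spell notes differently.

```python
NOTE_TO_PC = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4,
              "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9,
              "A#": 10, "Bb": 10, "B": 11}
PC_TO_NOTE = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def canonical_shift(tonic: str, mode: str) -> int:
    """Semitones to add so a major key lands on C and a minor key on A."""
    target = 0 if mode == "major" else 9
    return (target - NOTE_TO_PC[tonic]) % 12


def transpose_note(note: str, shift: int) -> str:
    """Shift one note name by shift semitones, spelled with sharps."""
    return PC_TO_NOTE[(NOTE_TO_PC[note] + shift) % 12]


# A slash chord moves root and bass by the same offset:
# F#/A# in F# major becomes C/E in C major.
shift = canonical_shift("F#", "major")
root, bass = transpose_note("F#", shift), transpose_note("A#", shift)
```

Applying one offset to both root and bass keeps the inversion intact, which
is what the slash-chord tests check.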
tests/test_chord_file_parser.py: 39 tests covering parsing of 4 valid fixtures
(C major, F# major, B minor, G# minor), error messages for 2 invalid
fixtures, and transposition correctness including slash chord root+bass.
tests/fixtures/: 6 .chord fixture files (4 valid, 2 invalid).
requirements.txt: pinned to the latest stable versions
(torch 2.12.0, music21 10.1.0, pretty_midi 0.2.11, matplotlib 3.10.9,
numpy 2.4.6, pandas 3.0.3, pytest 9.0.3); Python >= 3.11 noted.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>