84ba7b4743
- src/dataset.py: ChordDataset wrapping .pt files with pad/truncate
- scripts/prepare_data.py: tokenize .chord to .pt with train/val/holdout split, logs token length stats and style/function distributions
- src/external_converters/mcgill_to_chord.py: rewrite parser for real McGill v2 format (2-column annotation, each bar in its own pipe group, interval bass notation e.g. /5 and /b3)
- .gitignore: exclude data/processed/train, val, holdout subdirectories
- tests: 37 new tests for ChordDataset and converter (260 total, all pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 lines
205 B
Plaintext
# artist: Test Artist
# title: Test Song
# metre: 4/4
# tonic: C

0.000000 silence
4.000000 A, verse, | C:maj | F:maj | G:7 | C:maj |
20.000000 B, chorus, | F:maj | C:maj | G:7 | C:maj |
36.000000 silence