Commit Graph

3 Commits

Author SHA1 Message Date
H1K0 3cd9c29d9f feat: extend time signature support to 9 metres (adds 5/4, 7/4, 7/8, 9/8)
Add 5/4, 7/4, 7/8, 9/8 to _VALID_TIMES and VOCAB (TIME_* tokens).
Vocab size grows from 81 to 85 tokens. _parse_metre in the McGill
converter assigns subdivision=8 to 7/8 and 9/8. Spec bumped to v2.2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 00:37:05 +03:00
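
The metre handling described in this commit could be sketched roughly as follows. This is a minimal illustration, not the converter's actual _parse_metre: the names NEW_METRES, parse_new_metre and time_token are hypothetical, the TIME_7_8 token spelling is an assumption, and subdivision=4 for the /4 metres is a guess (the commit only states subdivision=8 for 7/8 and 9/8). The five pre-existing metres are not listed in the commit message, so the sketch covers only the four new ones.

```python
import re

# Only the four metres named in the commit message. The five metres that
# already existed are not listed there, so they are left out of this sketch.
NEW_METRES = frozenset({"5/4", "7/4", "7/8", "9/8"})

_METRE_RE = re.compile(r"(\d+)/(\d+)")


def parse_new_metre(text):
    """Return (beats, unit, subdivision) for one of the newly added metres.

    Hypothetical stand-in for the converter's _parse_metre.
    """
    if text not in NEW_METRES:
        raise ValueError(f"not one of the newly added metres: {text!r}")
    match = _METRE_RE.fullmatch(text)
    beats, unit = int(match.group(1)), int(match.group(2))
    # The commit states subdivision=8 for 7/8 and 9/8.
    # Using 4 for the /4 metres is an assumption made for this sketch.
    subdivision = 8 if unit == 8 else 4
    return beats, unit, subdivision


def time_token(text):
    """Build a TIME_* token name; the exact spelling is an assumption."""
    return "TIME_" + text.replace("/", "_")
```

For example, parse_new_metre("7/8") yields (7, 8, 8) under these assumptions, and time_token("9/8") yields "TIME_9_8".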
H1K0 4fd8ece170 refactor: replace fixed STYLE_user with open-ended style tag system
- STYLE_user renamed to STYLE_H1K0 in VOCAB (author's personal tag)
- Style field now accepts any [A-Za-z][A-Za-z0-9_]* identifier in .chord files
- Unknown styles fall back to STYLE_other at tokenization time with a log warning
- Test fixtures updated to style: other; drop closed _VALID_STYLES frozenset
- Spec bumped to v2.1: documents open style field, fallback behaviour, and §5.7
  guide on registering a new style token

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 00:29:52 +03:00
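
The open-ended style behaviour listed in this commit might look something like the sketch below. It assumes the identifier pattern quoted in the commit message and a fallback to STYLE_other with a logged warning. The names KNOWN_STYLE_TOKENS, is_valid_style and style_token are hypothetical, and the set of registered tokens is reduced to the two that the commit mentions.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Identifier pattern quoted in the commit message.
_STYLE_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Hypothetical subset of the vocabulary: only the two style tokens that
# the commit message names. The real VOCAB may contain more.
KNOWN_STYLE_TOKENS = frozenset({"STYLE_H1K0", "STYLE_other"})


def is_valid_style(name):
    """True if the name is a syntactically valid style identifier."""
    return _STYLE_RE.fullmatch(name) is not None


def style_token(name):
    """Map a style name to a token, falling back to STYLE_other."""
    if not is_valid_style(name):
        raise ValueError(f"invalid style identifier: {name!r}")
    token = f"STYLE_{name}"
    if token in KNOWN_STYLE_TOKENS:
        return token
    # Unknown but well-formed styles are accepted and mapped to the
    # generic token, with a warning so the author can register them later.
    logger.warning("unknown style %r, falling back to STYLE_other", name)
    return "STYLE_other"
```

Keeping validation (syntax) separate from lookup (vocabulary membership) means a .chord file with a new style still parses, and only tokenization degrades to the generic token.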
H1K0 868af4ac42 feat: add vocabulary constants and tokenize/detokenize to tokenizer.py
Adds VOCAB (81 tokens), TOKEN_TO_ID, and ID_TO_TOKEN per spec §5.2.
tokenize_period() transposes to C/Am, then emits BOS + metadata tokens +
per-bar chord/HOLD/NC tokens + BAR + EOS. detokenize_to_period() is the
exact inverse, returning a ChordPeriod in canonical key. Because
parentheses are not valid in token names, the m(add9) quality maps to
QUAL_m_add9 in the vocab via the _qual_token/_token_qual helpers.

36 new tests cover vocabulary integrity, token sequence structure,
and full round-trip fidelity for all four valid fixture files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:47:28 +03:00
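
The quality-name mapping mentioned in this commit can be illustrated with a small round-trip sketch. It is not the code from tokenizer.py: qual_token and token_qual are hypothetical stand-ins for the _qual_token/_token_qual helpers, and only the m(add9) case is taken from the commit message. Any other quality (such as "maj7" in the usage note) is assumed to pass through unchanged.

```python
# Qualities whose spelling cannot be used directly in a token name.
# Only the m(add9) entry comes from the commit message.
_QUAL_TO_SUFFIX = {"m(add9)": "m_add9"}
_SUFFIX_TO_QUAL = {suffix: qual for qual, suffix in _QUAL_TO_SUFFIX.items()}

_PREFIX = "QUAL_"


def qual_token(quality):
    """Chord quality -> token name, e.g. 'm(add9)' -> 'QUAL_m_add9'."""
    return _PREFIX + _QUAL_TO_SUFFIX.get(quality, quality)


def token_qual(token):
    """Token name -> chord quality; the inverse of qual_token."""
    if not token.startswith(_PREFIX):
        raise ValueError(f"not a quality token: {token!r}")
    suffix = token[len(_PREFIX):]
    return _SUFFIX_TO_QUAL.get(suffix, suffix)
```

With this table-driven approach, token_qual(qual_token(q)) returns q both for "m(add9)" and for any pass-through quality such as the illustrative "maj7", which is the property a round-trip test would check.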