refactor: replace fixed STYLE_user with open-ended style tag system

- STYLE_user renamed to STYLE_H1K0 in VOCAB (author's personal tag)
- Style field now accepts any [A-Za-z][A-Za-z0-9_]* identifier in .chord files
- Unknown styles fall back to STYLE_other at tokenization time with a log warning
- Test fixtures updated to style: other; closed _VALID_STYLES frozenset dropped
- Spec bumped to v2.1: documents open style field, fallback behaviour, and §5.7
  guide on registering a new style token
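
A minimal sketch of the fallback described above, for illustration only. The names
style_token, KNOWN_STYLE_TOKENS, and _STYLE_RE are hypothetical and do not come from
this commit; only the identifier pattern, STYLE_H1K0, STYLE_other, and the warning on
fallback are taken from the message. Raising on a malformed identifier is also an
assumption, since the commit does not say how invalid names are handled.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Identifier pattern quoted in the commit message.
_STYLE_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Hypothetical stand-in for the style entries of VOCAB (illustrative subset).
KNOWN_STYLE_TOKENS = {"STYLE_H1K0", "STYLE_other"}


def style_token(style: str) -> str:
    """Map a style name from a .chord file to a vocabulary token."""
    # Assumption: malformed identifiers are rejected outright.
    if not _STYLE_RE.fullmatch(style):
        raise ValueError(f"invalid style identifier: {style!r}")
    token = f"STYLE_{style}"
    if token not in KNOWN_STYLE_TOKENS:
        # Well-formed but unregistered styles fall back with a warning.
        logger.warning("unknown style %r; falling back to STYLE_other", style)
        return "STYLE_other"
    return token
```

Under these assumptions, style_token("H1K0") yields the registered token, while a
well-formed but unregistered name such as "jazz" logs a warning and yields STYLE_other.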

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 00:29:52 +03:00
parent 84ba7b4743
commit 4fd8ece170
12 changed files with 60 additions and 38 deletions
+1 -1
@@ -21,7 +21,7 @@ def _write_pt(tmp_path: Path, stem: str, n_tokens: int) -> Path:
     """Write a dummy .pt file with sequential token IDs."""
     tokens = torch.arange(n_tokens, dtype=torch.long)
     path = tmp_path / f"{stem}.pt"
-    torch.save({"tokens": tokens, "meta": {"style": "user", "function": "verse"}}, path)
+    torch.save({"tokens": tokens, "meta": {"style": "other", "function": "verse"}}, path)
     return path