5 Commits

Author SHA1 Message Date
H1K0 f00a6c1b3a feat: add bigram repetition penalty to generate_period
Tracks ROOT-level bigrams (prev_root → curr_root) across chord-change events.
At each FREE position, subtracts penalty * count(prev→root) from ROOT logits,
capped at 3.0 to prevent NC/HOLD flooding at extreme values.

Practical range: 0.5 (mild, breaks loops after 2 occurrences) to 1.0
(aggressive). Default 0.0 keeps backward compatibility.

Added --repetition-penalty flag to scripts/generate.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 15:19:24 +03:00
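
The bigram penalty described in the commit above could look roughly like the sketch below. It is not the repository's code: the function names, the dict-based logits, and the reading that the 3.0 cap applies to the total amount subtracted (rather than to the flag value) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# Assumption: the cap limits the total amount subtracted from one logit.
MAX_PENALTY = 3.0

BigramCounts = Counter  # maps (prev_root, curr_root) -> number of occurrences


def record_chord_change(counts: BigramCounts, prev_root: Optional[str], curr_root: str) -> None:
    """Count one ROOT-level bigram at a chord-change event (hypothetical helper)."""
    if prev_root is not None:
        counts[(prev_root, curr_root)] += 1


def apply_bigram_penalty(
    root_logits: Dict[str, float],
    prev_root: Optional[str],
    counts: BigramCounts,
    penalty: float,
) -> Dict[str, float]:
    """Return a copy of root_logits with already-seen bigrams penalized."""
    if penalty <= 0.0 or prev_root is None:
        # Default 0.0 leaves the logits unchanged.
        return dict(root_logits)
    adjusted = {}
    for root, logit in root_logits.items():
        seen = counts[(prev_root, root)]  # Counter returns 0 for unseen pairs
        adjusted[root] = logit - min(penalty * seen, MAX_PENALTY)
    return adjusted


counts: BigramCounts = Counter()
record_chord_change(counts, "C", "G")
record_chord_change(counts, "C", "G")
mild = apply_bigram_penalty({"C": 1.0, "G": 2.0}, "C", counts, penalty=0.5)
```

With a penalty of 0.5, two earlier C→G transitions lower the G logit by 1.0 and leave the unseen roots untouched.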
H1K0 9e73fa5d32 feat: add --bars arg to control output length
generate_period() now accepts n_bars=N to stop after exactly N complete
bars. bars_completed is seeded from the prefix length so --bars counts
the full output, not just the generated tail.

scripts/generate.py exposes this as --bars (default: None = model decides).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 20:29:44 +03:00
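
A minimal sketch of how a bar limit seeded from the prefix might work, assuming a fixed number of positions per bar and one token per position. The function name, the sample_next callback, the "EOS" string, and max_positions are hypothetical; the real generate_period() signature is not shown in this log.

```python
from typing import Callable, List, Optional


def generate_positions(
    prefix: List[str],
    positions_per_bar: int,
    n_bars: Optional[int],
    sample_next: Callable[[List[str]], str],
    max_positions: int = 256,
) -> List[str]:
    """Append positions to prefix until n_bars complete bars exist in total."""
    out = list(prefix)
    # Seed the bar counter from the prefix so the limit covers the full output.
    bars_completed, pos_in_bar = divmod(len(out), positions_per_bar)
    while len(out) < max_positions:
        if n_bars is not None and bars_completed >= n_bars:
            break
        token = sample_next(out)
        if token == "EOS":
            # With n_bars=None the model decides where to stop.
            break
        out.append(token)
        pos_in_bar += 1
        if pos_in_bar == positions_per_bar:
            bars_completed += 1
            pos_in_bar = 0
    return out


def always_hold(context: List[str]) -> str:
    """Stand-in sampler that always emits a hold."""
    return "."


three_bars = generate_positions(["C", ".", ".", "."], 4, 3, always_hold)
```

Here the one-bar prefix counts toward the limit, so only two further bars are generated.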
H1K0 f6ce2a41d3 fix: support '.' and 'NC' in --prefix argument
_encode_prefix now handles hold ('.') and no-chord ('NC') tokens
alongside chord symbols, and returns (ids, n_positions) so that
pos_in_bar is tracked correctly regardless of token type.

Fixes ChordParseError when dots were passed in --prefix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 20:25:41 +03:00
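
The idea of returning both token ids and a position count can be illustrated as follows. This is a sketch under assumptions: the vocabulary entries, the encode_chord callback, and the premise that a chord symbol expands to several tokens while occupying a single position are all hypothetical, and the real _encode_prefix may differ.

```python
from typing import Callable, Dict, List, Tuple


def encode_prefix(
    symbols: List[str],
    vocab: Dict[str, int],
    encode_chord: Callable[[str], List[int]],
) -> Tuple[List[int], int]:
    """Encode prefix symbols and count positions independently of token count."""
    ids: List[int] = []
    n_positions = 0
    for symbol in symbols:
        if symbol == ".":
            ids.append(vocab["HOLD"])  # hold: one token, one position
        elif symbol == "NC":
            ids.append(vocab["NC"])  # no-chord: one token, one position
        else:
            ids.extend(encode_chord(symbol))  # chord: several tokens, one position
        n_positions += 1
    return ids, n_positions


def toy_encode_chord(symbol: str) -> List[int]:
    """Stand-in chord encoder that always emits two token ids."""
    return [1, 2]


toy_vocab = {"HOLD": 0, "NC": 3}
ids, n_positions = encode_prefix(["Am7", ".", "NC", "."], toy_vocab, toy_encode_chord)
```

Counting positions separately from tokens lets a caller derive pos_in_bar as n_positions modulo the bar length.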
H1K0 2a3eb1783a fix: fine-tune config and generator improvements
scripts/train.py: fix max_seq_len 256→320 (must match pretrained checkpoint);
increase epochs 15→50 and patience 5→10 to give the small corpus enough
gradient steps; reduce warmup 20→10 (was 22% of total steps).

scripts/generate.py: default to prepending the tonic chord when --prefix is
not given; add --no-tonic-anchor to opt out.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 10:15:48 +03:00
H1K0 e657d9edb5 feat: add generate module and CLI; fix minor tokenizer issues
src/generate.py: autoregressive generation with top-p sampling, grammar
masking (ROOT→QUAL→EXT→BASS; EOS only at bar boundary), key transposition,
and optional chord prefix. Partial bars on context truncation are padded
with HOLDs rather than discarded.

scripts/generate.py: CLI wrapping generate_period. Accepts mode, key,
time, subdivision, style, function, prefix, temperature, top-p, seed, and
tempo; writes .chord and optional MIDI.

src/tokenizer.py: fix docstring vocab size (81→84); in _tokens_to_symbol,
drop the slash when BASS_<note> equals the root, since the bass is redundant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 14:28:44 +03:00
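
For readers unfamiliar with top-p (nucleus) sampling, which the commit above names, the filtering step can be sketched in plain Python. This is a generic illustration of the technique, not the code in src/generate.py; the function name and the dict-based logits are assumptions, and grammar masking is left out.

```python
import math
from typing import Dict


def top_p_filter(logits: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of top tokens whose probability mass reaches top_p."""
    # Softmax with the maximum subtracted for numerical stability.
    peak = max(logits.values())
    exps = {tok: math.exp(value - peak) for tok, value in logits.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(kept.values())
    return {tok: p / norm for tok, p in kept.items()}


example_logits = {"C": math.log(0.6), "G": math.log(0.3), "F": math.log(0.1)}
nucleus = top_p_filter(example_logits, top_p=0.8)
```

A token would then be drawn from the returned distribution, for example with random.choices over its keys and values.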