data: update pretrained checkpoint results (BAR-free tokenizer)

Re-run pre-training results with the corrected 84-token vocabulary and
max_seq_len=320.  Previous checkpoint was trained on stale data with BAR
tokens and a corrupted tokenizer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-20 14:28:00 +03:00
parent 4aead2ea20
commit 8a73394df9
3 changed files with 107 additions and 107 deletions
Binary file not shown.