data: add fine-tuning run results (lr=1e-5, 50 epochs)

val loss 1.24 → 0.80, val perplexity 3.47 → 2.22.
Best epoch 50 (no early stop); convergence epoch 30.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Binary file not shown.