d9585ec008
val loss 1.19 → 0.77, val perplexity 3.29 → 2.15. Best epoch 20, early stop at epoch 30 (patience=10). Improvement over previous lr=1e-5 run (best val ppl 2.22).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
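The early stop mentioned in the commit message can be illustrated with a minimal patience-based sketch. It assumes that patience=10 means training halts once 10 consecutive epochs pass without a strictly lower validation loss. The function name `early_stop_epoch` is hypothetical and is not taken from the training script.

```python
def early_stop_epoch(val_losses, patience=10):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Assumption: an epoch counts as an improvement only if its validation
    loss is strictly lower than the best seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience window.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return best_epoch, epoch
    # Ran out of epochs before the patience window was exhausted.
    return best_epoch, len(val_losses)


# Toy example with patience=2: the best loss is at epoch 2,
# and training stops two epochs later.
print(early_stop_epoch([1.0, 0.9, 0.95, 0.93, 0.92], patience=2))
```

Under this reading, a best epoch of 20 with patience=10 implies a stop at epoch 30, which matches the figures in the message.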
48 lines
2.1 KiB
Plaintext
====================================================
  FINE-TUNING REPORT
====================================================
  Total epochs run      : 30
  Best epoch (val loss) : 20
  Convergence epoch     : 15  (val ≤ best + 1%)
  Best val loss         : 0.7668
  Best val perplexity   : 2.15
  Final train loss      : 0.5379
  Unique parameters     : 1,396,416
  Checkpoint            : checkpoints/finetuned.pt
  Log CSV               : checkpoints/finetuned.log.csv
====================================================

epoch     train       val      ppl          lr
-----  --------  --------  -------  ----------
    1    1.1733    1.1905     3.29    2.40e-05
    2    1.0317    1.0171     2.77    3.00e-05
    3    0.9005    0.9146     2.50    2.99e-05
    4    0.8265    0.8777     2.41    2.98e-05
    5    0.7970    0.8517     2.34    2.96e-05
    6    0.7591    0.8368     2.31    2.93e-05
    7    0.7370    0.8194     2.27    2.90e-05
    8    0.7228    0.8066     2.24    2.86e-05
    9    0.6933    0.7973     2.22    2.82e-05
   10    0.6833    0.7923     2.21    2.77e-05
   11    0.6731    0.7879     2.20    2.71e-05
   12    0.6559    0.7830     2.19    2.65e-05
   13    0.6432    0.7776     2.18    2.59e-05
   14    0.6360    0.7746     2.17    2.52e-05
   15    0.6307    0.7731     2.17    2.45e-05
   16    0.6225    0.7715     2.16    2.37e-05
   17    0.6069    0.7695     2.16    2.29e-05
   18    0.6011    0.7682     2.16    2.21e-05
   19    0.6019    0.7682     2.16    2.12e-05
   20    0.5804    0.7668     2.15    2.03e-05  ←
   21    0.5749    0.7675     2.15    1.94e-05
   22    0.5770    0.7696     2.16    1.85e-05
   23    0.5672    0.7710     2.16    1.75e-05
   24    0.5646    0.7712     2.16    1.66e-05
   25    0.5569    0.7723     2.16    1.56e-05
   26    0.5561    0.7710     2.16    1.46e-05
   27    0.5515    0.7691     2.16    1.37e-05
   28    0.5428    0.7685     2.16    1.27e-05
   29    0.5428    0.7702     2.16    1.18e-05
   30    0.5379    0.7711     2.16    1.08e-05
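The summary figures can be re-derived from the per-epoch validation losses. The sketch below rests on two assumptions that are readings of the report rather than facts confirmed from the training code: perplexity is exp(val loss), and the convergence epoch is the first epoch whose validation loss is within 1% of the best. The names `summarize` and `VAL_LOSSES` are illustrative only.

```python
import math

# Validation losses for epochs 1..30, copied from the val column of the report.
VAL_LOSSES = [
    1.1905, 1.0171, 0.9146, 0.8777, 0.8517, 0.8368, 0.8194, 0.8066, 0.7973, 0.7923,
    0.7879, 0.7830, 0.7776, 0.7746, 0.7731, 0.7715, 0.7695, 0.7682, 0.7682, 0.7668,
    0.7675, 0.7696, 0.7710, 0.7712, 0.7723, 0.7710, 0.7691, 0.7685, 0.7702, 0.7711,
]


def summarize(val_losses, tolerance=0.01):
    """Recompute best epoch, convergence epoch, and best perplexity.

    Assumptions: epochs are 1-indexed, perplexity = exp(loss), and
    "convergence" means the first epoch with loss <= best * (1 + tolerance).
    """
    best_loss = min(val_losses)
    best_epoch = val_losses.index(best_loss) + 1
    threshold = best_loss * (1.0 + tolerance)
    convergence_epoch = next(
        epoch for epoch, loss in enumerate(val_losses, start=1) if loss <= threshold
    )
    return {
        "best_epoch": best_epoch,
        "best_loss": best_loss,
        "best_ppl": math.exp(best_loss),
        "convergence_epoch": convergence_epoch,
    }


summary = summarize(VAL_LOSSES)
print(summary["best_epoch"], summary["convergence_epoch"], round(summary["best_ppl"], 2))
```

With these assumptions the recomputed values agree with the report: best epoch 20, convergence epoch 15, and a best validation perplexity of 2.15.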