====================================================
  PRE-TRAINING REPORT
====================================================
  Total epochs run       : 50
  Best epoch (val loss)  : 49
  Convergence epoch      : 42  (val ≤ best+1 %)
  Best val loss          : 0.2785
  Best val perplexity    : 1.32
  Final train loss       : 0.2539
  Unique parameters      : 1,396,416
  Checkpoint             : checkpoints/pretrained.pt
  Log CSV                : checkpoints/pretrained.log.csv
====================================================

epoch     train       val      ppl          lr
-----  --------  --------  -------  ----------
    1    2.0431    0.8604     2.36    2.20e-04
    2    0.6824    0.5873     1.80    3.00e-04
    3    0.5679    0.5449     1.72    2.99e-04
    4    0.5294    0.5129     1.67    2.98e-04
    5    0.5054    0.4908     1.63    2.96e-04
    6    0.4849    0.4717     1.60    2.93e-04
    7    0.4671    0.4569     1.58    2.90e-04
    8    0.4502    0.4428     1.56    2.86e-04
    9    0.4359    0.4285     1.53    2.82e-04
   10    0.4256    0.4201     1.52    2.77e-04
   11    0.4148    0.4112     1.51    2.72e-04
   12    0.4055    0.4097     1.51    2.66e-04
   13    0.3969    0.3919     1.48    2.60e-04
   14    0.3876    0.3873     1.47    2.53e-04
   15    0.3791    0.3851     1.47    2.45e-04
   16    0.3717    0.3745     1.45    2.38e-04
   17    0.3645    0.3673     1.44    2.30e-04
   18    0.3574    0.3645     1.44    2.21e-04
   19    0.3503    0.3585     1.43    2.13e-04
   20    0.3430    0.3498     1.42    2.04e-04
   21    0.3377    0.3438     1.41    1.95e-04
   22    0.3308    0.3370     1.40    1.85e-04
   23    0.3248    0.3323     1.39    1.76e-04
   24    0.3194    0.3249     1.38    1.66e-04
   25    0.3141    0.3215     1.38    1.57e-04
   26    0.3098    0.3177     1.37    1.47e-04
   27    0.3043    0.3134     1.37    1.37e-04
   28    0.3000    0.3108     1.36    1.28e-04
   29    0.2950    0.3072     1.36    1.18e-04
   30    0.2901    0.3034     1.35    1.09e-04
   31    0.2880    0.3020     1.35    9.95e-05
   32    0.2835    0.2993     1.35    9.05e-05
   33    0.2805    0.2948     1.34    8.17e-05
   34    0.2759    0.2919     1.34    7.32e-05
   35    0.2737    0.2888     1.33    6.51e-05
   36    0.2706    0.2878     1.33    5.73e-05
   37    0.2679    0.2865     1.33    4.98e-05
   38    0.2660    0.2848     1.33    4.28e-05
   39    0.2645    0.2837     1.33    3.63e-05
   40    0.2623    0.2827     1.33    3.02e-05
   41    0.2608    0.2822     1.33    2.46e-05
   42    0.2589    0.2807     1.32    1.96e-05
   43    0.2579    0.2802     1.32    1.51e-05
   44    0.2568    0.2794     1.32    1.11e-05
   45    0.2549    0.2793     1.32    7.75e-06
   46    0.2556    0.2789     1.32    4.98e-06
   47    0.2550    0.2787     1.32    2.81e-06
   48    0.2543    0.2786     1.32    1.25e-06
   49    0.2524    0.2785     1.32    3.13e-07  ←
   50    0.2539    0.2785     1.32    0.00e+00
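
The summary figures can be re-derived from the val column. The sketch below is not the script that produced this report; it assumes that ppl is exp(val loss), that the best epoch is the first epoch reaching the minimum val loss, and that the convergence epoch is the first epoch whose val loss is at most 1.01 times the best val loss. The names VAL_LOSS and summarize are hypothetical, and the values are the rounded ones printed in the table.

```python
import math

# Validation losses for epochs 1..50, copied from the val column of the report.
VAL_LOSS = [
    0.8604, 0.5873, 0.5449, 0.5129, 0.4908, 0.4717, 0.4569, 0.4428, 0.4285, 0.4201,
    0.4112, 0.4097, 0.3919, 0.3873, 0.3851, 0.3745, 0.3673, 0.3645, 0.3585, 0.3498,
    0.3438, 0.3370, 0.3323, 0.3249, 0.3215, 0.3177, 0.3134, 0.3108, 0.3072, 0.3034,
    0.3020, 0.2993, 0.2948, 0.2919, 0.2888, 0.2878, 0.2865, 0.2848, 0.2837, 0.2827,
    0.2822, 0.2807, 0.2802, 0.2794, 0.2793, 0.2789, 0.2787, 0.2786, 0.2785, 0.2785,
]


def summarize(val_losses, tolerance=0.01):
    """Recompute the summary fields from per-epoch validation losses.

    Assumed definitions (not confirmed by the report):
      - best epoch: first epoch that reaches the minimum val loss
      - convergence epoch: first epoch with val <= best * (1 + tolerance)
      - perplexity: exp(val loss)
    """
    # min() returns the first minimal element, so ties resolve to the earliest epoch.
    best_idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    best = val_losses[best_idx]
    threshold = best * (1.0 + tolerance)
    conv_idx = next(i for i, v in enumerate(val_losses) if v <= threshold)
    return {
        "best_epoch": best_idx + 1,          # epochs are 1-based in the report
        "best_val_loss": best,
        "best_val_ppl": math.exp(best),
        "convergence_epoch": conv_idx + 1,
    }


if __name__ == "__main__":
    for key, value in summarize(VAL_LOSS).items():
        print(f"{key:18s}: {value}")
```

Under these assumptions the sketch reproduces the reported best epoch (49), convergence epoch (42), and best val perplexity (1.32 after rounding). Epochs 49 and 50 tie at 0.2785 in the rounded table, so the tie-breaking rule decides which one is reported as best.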