
====================================================
  PRE-TRAINING REPORT
====================================================
  Total epochs run      : 50
  Best epoch (val loss) : 49
  Convergence epoch     : 42  (first epoch with val loss ≤ 1.01 × best)
  Best val loss         : 0.2785
  Best val perplexity   : 1.32
  Final train loss      : 0.2539
  Unique parameters     : 1,396,416
  Checkpoint            : checkpoints/pretrained.pt
  Log CSV               : checkpoints/pretrained.log.csv
====================================================

  epoch     train       val      ppl          lr
  -----  --------  --------  -------  ----------
      1    2.0431    0.8604     2.36    2.20e-04
      2    0.6824    0.5873     1.80    3.00e-04
      3    0.5679    0.5449     1.72    2.99e-04
      4    0.5294    0.5129     1.67    2.98e-04
      5    0.5054    0.4908     1.63    2.96e-04
      6    0.4849    0.4717     1.60    2.93e-04
      7    0.4671    0.4569     1.58    2.90e-04
      8    0.4502    0.4428     1.56    2.86e-04
      9    0.4359    0.4285     1.53    2.82e-04
     10    0.4256    0.4201     1.52    2.77e-04
     11    0.4148    0.4112     1.51    2.72e-04
     12    0.4055    0.4097     1.51    2.66e-04
     13    0.3969    0.3919     1.48    2.60e-04
     14    0.3876    0.3873     1.47    2.53e-04
     15    0.3791    0.3851     1.47    2.45e-04
     16    0.3717    0.3745     1.45    2.38e-04
     17    0.3645    0.3673     1.44    2.30e-04
     18    0.3574    0.3645     1.44    2.21e-04
     19    0.3503    0.3585     1.43    2.13e-04
     20    0.3430    0.3498     1.42    2.04e-04
     21    0.3377    0.3438     1.41    1.95e-04
     22    0.3308    0.3370     1.40    1.85e-04
     23    0.3248    0.3323     1.39    1.76e-04
     24    0.3194    0.3249     1.38    1.66e-04
     25    0.3141    0.3215     1.38    1.57e-04
     26    0.3098    0.3177     1.37    1.47e-04
     27    0.3043    0.3134     1.37    1.37e-04
     28    0.3000    0.3108     1.36    1.28e-04
     29    0.2950    0.3072     1.36    1.18e-04
     30    0.2901    0.3034     1.35    1.09e-04
     31    0.2880    0.3020     1.35    9.95e-05
     32    0.2835    0.2993     1.35    9.05e-05
     33    0.2805    0.2948     1.34    8.17e-05
     34    0.2759    0.2919     1.34    7.32e-05
     35    0.2737    0.2888     1.33    6.51e-05
     36    0.2706    0.2878     1.33    5.73e-05
     37    0.2679    0.2865     1.33    4.98e-05
     38    0.2660    0.2848     1.33    4.28e-05
     39    0.2645    0.2837     1.33    3.63e-05
     40    0.2623    0.2827     1.33    3.02e-05
     41    0.2608    0.2822     1.33    2.46e-05
     42    0.2589    0.2807     1.32    1.96e-05
     43    0.2579    0.2802     1.32    1.51e-05
     44    0.2568    0.2794     1.32    1.11e-05
     45    0.2549    0.2793     1.32    7.75e-06
     46    0.2556    0.2789     1.32    4.98e-06
     47    0.2550    0.2787     1.32    2.81e-06
     48    0.2543    0.2786     1.32    1.25e-06
     49    0.2524    0.2785     1.32    3.13e-07 ← best val loss
     50    0.2539    0.2785     1.32    0.00e+00
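
The summary fields above can be recomputed from the val column. The sketch below is only an illustration, not the script that produced this report: the function name `summarize` and its parameters are hypothetical, perplexity is assumed to be exp(val loss), and the convergence epoch is assumed to be the first epoch whose val loss is within 1 % of the best. It uses the rounded values printed in the table, so ties (epochs 49 and 50 both show 0.2785) resolve to the earlier epoch.

```python
import math


def summarize(val_losses, first_epoch=1, tolerance=0.01):
    """Recompute best epoch, perplexity, and convergence epoch.

    val_losses[i] is the validation loss of epoch first_epoch + i.
    All names here are hypothetical; the real training script may differ.
    """
    best_loss = min(val_losses)
    # list.index returns the first occurrence, so ties go to the earlier epoch
    best_epoch = first_epoch + val_losses.index(best_loss)
    # assumed convergence rule: first epoch with val loss <= best * (1 + tolerance)
    threshold = best_loss * (1.0 + tolerance)
    convergence_epoch = next(
        epoch
        for epoch, loss in enumerate(val_losses, start=first_epoch)
        if loss <= threshold
    )
    return {
        "best_epoch": best_epoch,
        "best_val_loss": best_loss,
        "best_val_ppl": math.exp(best_loss),  # assumed: ppl = exp(loss)
        "convergence_epoch": convergence_epoch,
    }


# Validation losses for epochs 41-50, copied from the table above
tail = [0.2822, 0.2807, 0.2802, 0.2794, 0.2793,
        0.2789, 0.2787, 0.2786, 0.2785, 0.2785]
summary = summarize(tail, first_epoch=41)
print(summary)
```

Passing only the tail of the table works here because the epochs before 41 all have val loss above the 1 % threshold; for a full check, the whole val column from the log CSV would be passed with `first_epoch=1`.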
