---
library_name: transformers
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-base-samsum-tiny
    results: []
---

# flan-t5-base-samsum-tiny

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

- Loss: 1.5151
- Rouge1: 47.0543
- Rouge2: 23.0375
- Rougel: 39.1234
- Rougelsum: 42.8088
- Gen Len: 17.68
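
As a minimal usage sketch, the checkpoint can be loaded with the Transformers summarization pipeline. The repo ID below is an assumption (substitute the actual Hub ID where this checkpoint is hosted), and the dialogue is an illustrative placeholder:

```python
from transformers import pipeline

# Hypothetical repo ID -- replace with the actual Hub ID of this checkpoint.
summarizer = pipeline("summarization", model="EdBerg/flan-t5-base-samsum-tiny")

dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes! 12:30 at the usual place?\n"
    "Anna: Perfect, see you there."
)

# max_length is chosen to comfortably cover the observed Gen Len (~18 tokens).
print(summarizer(dialogue, max_length=50, min_length=5)[0]["summary_text"])
```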

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
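
A minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; the output directory is an assumption based on the model name, and the dataset/tokenization setup is not taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-samsum-tiny",  # assumed name, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # betas=(0.9, 0.999), epsilon=1e-08 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=5,
    predict_with_generate=True,   # needed to compute ROUGE during evaluation
)
```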

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 13   | 1.5191          | 46.3463 | 23.0242 | 39.6192 | 42.4292   | 16.76   |
| No log        | 2.0   | 26   | 1.5157          | 46.7915 | 22.7085 | 39.3789 | 42.7189   | 17.26   |
| No log        | 3.0   | 39   | 1.5151          | 47.0543 | 23.0375 | 39.1234 | 42.8088   | 17.68   |
| No log        | 4.0   | 52   | 1.5185          | 46.4955 | 22.6122 | 38.2357 | 42.3003   | 17.57   |
| No log        | 5.0   | 65   | 1.5198          | 46.5658 | 22.6509 | 38.3667 | 42.5244   | 17.59   |
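
The ROUGE columns above follow the conventions of the Hugging Face `evaluate` library's `rouge` metric (F-measures scaled to 0–100). A hedged sketch of recomputing such scores, with placeholder predictions and references standing in for the model's generations and the evaluation-set summaries:

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder texts -- in practice, use the model's generated summaries and
# the reference summaries from the evaluation set.
predictions = ["Anna and Ben agree to meet for lunch at 12:30."]
references = ["Anna and Ben will have lunch tomorrow at 12:30."]

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; the table reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```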

### Framework versions

- Transformers 4.52.4
- PyTorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1