hmueller25/long-t5-tglobal-base-german-law

text2text-generation

Generated from Trainer


hmueller25 committed on Jun 7

Commit e2ed4bb · verified · 1 parent: 5415daa

End of training

Files changed (2)

README.md +10 -10
model.safetensors +1 -1

README.md

@@ -18,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.3280
-- Rouge1: 0.0693
-- Rouge2: 0.0202
-- Rougel: 0.0626
-- Rougelsum: 0.0633
+- Loss: 3.4575
+- Rouge1: 0.072
+- Rouge2: 0.0236
+- Rougel: 0.0631
+- Rougelsum: 0.0631
 - Gen Len: 20.0
 ## Model description
@@ -44,7 +44,7 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
 - train_batch_size: 1
-- eval_batch_size: 1
+- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 8
 - total_train_batch_size: 8
@@ -56,10 +56,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| 5.5906        | 1.0   | 49   | 9.4417          | 0.0375 | 0.0103 | 0.0369 | 0.0377    | 20.0    |
-| 4.6503        | 2.0   | 98   | 3.7947          | 0.0634 | 0.0206 | 0.0555 | 0.0568    | 20.0    |
-| 3.6625        | 3.0   | 147  | 3.4385          | 0.072  | 0.0236 | 0.0631 | 0.0631    | 20.0    |
-| 3.1628        | 4.0   | 196  | 3.3280          | 0.0693 | 0.0202 | 0.0626 | 0.0633    | 20.0    |
+| 5.5906        | 1.0   | 49   | 9.5684          | 0.0375 | 0.0103 | 0.0369 | 0.0377    | 20.0    |
+| 4.6503        | 2.0   | 98   | 3.8125          | 0.0634 | 0.0206 | 0.0555 | 0.0568    | 20.0    |
+| 3.6625        | 3.0   | 147  | 3.4575          | 0.072  | 0.0236 | 0.0631 | 0.0631    | 20.0    |
+| 3.1628        | 4.0   | 196  | 3.3466          | 0.0693 | 0.0202 | 0.0626 | 0.0633    | 20.0    |
 ### Framework versions
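
The hyperparameters and the results table in the README diff can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device (the diff does not state the device count), and all variable names are hypothetical. It recomputes the effective batch size, gives a rough estimate of how many training examples one epoch covers, and finds which row of the new table matches the headline loss.

```python
# Values copied from the README diff; variable names are illustrative.
train_batch_size = 1
gradient_accumulation_steps = 8
num_devices = 1  # assumption: the device count is not stated in the diff

# Effective batch size per optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# 49 optimizer steps per epoch (steps 49, 98, 147, 196 at epochs 1 to 4).
steps_per_epoch = 49
# Rough estimate only: a partial final accumulation step could shift this slightly.
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

# (epoch, validation loss, rouge1) rows from the new ("+") side of the table.
new_rows = [
    (1.0, 9.5684, 0.0375),
    (2.0, 3.8125, 0.0634),
    (3.0, 3.4575, 0.0720),
    (4.0, 3.3466, 0.0693),
]
reported_loss = 3.4575  # headline "Loss" in the new README

# Which epoch does the headline loss come from?
matching_epoch = next(epoch for epoch, loss, _ in new_rows if loss == reported_loss)
# Which epochs are best by each criterion?
best_rouge1_epoch = max(new_rows, key=lambda row: row[2])[0]
lowest_loss_epoch = min(new_rows, key=lambda row: row[1])[0]

print(effective_batch_size, approx_examples_per_epoch)
print(matching_epoch, best_rouge1_epoch, lowest_loss_epoch)
```

Under these assumptions the effective batch size agrees with the listed `total_train_batch_size` of 8. The headline metrics in the new README match the epoch 3 row, which has the highest Rouge1, rather than the epoch 4 row, which has the lowest validation loss. One plausible explanation is that the best checkpoint was selected by a ROUGE metric, but the diff itself does not say so.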

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:15b7f688401128034d4fc70529d76ef14368b3a66ff784083249514b310ee951
+oid sha256:33c1928b8ae64bf0067ca48549141ac050634619c9276d47ab74bf2868d1f7e8
 size 1187780840
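
The `model.safetensors` entry is a Git LFS pointer: the `oid` is the SHA-256 of the real file's contents and `size` is its length in bytes. Here the size is unchanged while the hash differs, which is consistent with retrained weights of the same shape. As a minimal sketch, a downloaded copy could be checked against the new pointer as follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and SHA-256 given by the pointer."""
    algo, _, expected_hex = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    path = Path(path)
    # Cheap check first: a size mismatch means hashing is unnecessary.
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Stream in chunks so a ~1.2 GB file is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex


# Pointer contents copied from the new ("+") side of the diff.
NEW_POINTER = parse_lfs_pointer(
    """
    version https://git-lfs.github.com/spec/v1
    oid sha256:33c1928b8ae64bf0067ca48549141ac050634619c9276d47ab74bf2868d1f7e8
    size 1187780840
    """
)
```

Calling `matches_pointer("model.safetensors", NEW_POINTER)` on a local download (the path is an assumption) would then report whether the file corresponds to this commit.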