ssc-aln-mms-model-initadapt

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0167
  • CER: 0.2033
  • WER: 0.5139
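
For reference, a minimal inference sketch for this checkpoint, assuming 16 kHz mono audio in a local file `sample.wav` and that the checkpoint loads with the standard Wav2Vec2 CTC classes from Transformers (the file name and preprocessing steps are assumptions, not from the card):

```python
# Hedged sketch: load the fine-tuned MMS checkpoint and transcribe one file.
# The audio path and resampling choice are assumptions, not from the card.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "ctaguchi/ssc-aln-mms-model-initadapt"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample.wav", sr=16_000)  # MMS models expect 16 kHz input
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])  # greedy CTC decoding
```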

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch with these values follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
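
A minimal sketch of the hyperparameters above expressed as Transformers `TrainingArguments`; this is not the author's actual training script, and the output directory and the `fp16` flag (standing in for "Native AMP") are assumptions:

```python
# Hedged sketch: the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-aln-mms-model-initadapt",  # assumed output directory
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,             # effective train batch size: 16
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=10,
    fp16=True,                                 # assumed flag for "Native AMP"
)
```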

Training results

| Training Loss | Epoch  | Step | Validation Loss | CER    | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
| 2.326         | 0.2851 | 200  | 1.3406          | 0.3085 | 0.7861 |
| 1.9291        | 0.5702 | 400  | 1.1932          | 0.2523 | 0.6399 |
| 1.8537        | 0.8553 | 600  | 1.2253          | 0.2373 | 0.6090 |
| 1.7843        | 1.1397 | 800  | 1.2338          | 0.2341 | 0.6036 |
| 1.7958        | 1.4248 | 1000 | 1.2025          | 0.2308 | 0.5951 |
| 1.6595        | 1.7099 | 1200 | 1.1076          | 0.2314 | 0.5908 |
| 1.701         | 1.9950 | 1400 | 1.0804          | 0.2230 | 0.5723 |
| 1.7293        | 2.2794 | 1600 | 1.0814          | 0.2222 | 0.5654 |
| 1.6819        | 2.5645 | 1800 | 1.0562          | 0.2216 | 0.5635 |
| 1.648         | 2.8496 | 2000 | 1.0537          | 0.2185 | 0.5530 |
| 1.6319        | 3.1340 | 2200 | 1.0287          | 0.2205 | 0.5531 |
| 1.6095        | 3.4191 | 2400 | 1.0638          | 0.2146 | 0.5456 |
| 1.5876        | 3.7042 | 2600 | 1.0850          | 0.2112 | 0.5399 |
| 1.5731        | 3.9893 | 2800 | 1.0683          | 0.2114 | 0.5432 |
| 1.6124        | 4.2737 | 3000 | 1.0107          | 0.2240 | 0.5636 |
| 1.5419        | 4.5588 | 3200 | 1.0475          | 0.2084 | 0.5336 |
| 1.619         | 4.8439 | 3400 | 1.0426          | 0.2080 | 0.5304 |
| 1.5642        | 5.1283 | 3600 | 1.0268          | 0.2093 | 0.5349 |
| 1.608         | 5.4134 | 3800 | 1.0665          | 0.2064 | 0.5299 |
| 1.5029        | 5.6985 | 4000 | 1.0168          | 0.2098 | 0.5318 |
| 1.5463        | 5.9836 | 4200 | 1.0136          | 0.2083 | 0.5257 |
| 1.5153        | 6.2680 | 4400 | 1.0077          | 0.2110 | 0.5293 |
| 1.54          | 6.5531 | 4600 | 1.0324          | 0.2055 | 0.5248 |
| 1.5261        | 6.8382 | 4800 | 1.0372          | 0.2044 | 0.5205 |
| 1.4758        | 7.1226 | 5000 | 1.0717          | 0.2036 | 0.5200 |
| 1.5382        | 7.4077 | 5200 | 1.0106          | 0.2039 | 0.5174 |
| 1.4944        | 7.6928 | 5400 | 1.0455          | 0.2036 | 0.5150 |
| 1.547         | 7.9779 | 5600 | 1.0068          | 0.2059 | 0.5228 |
| 1.4804        | 8.2623 | 5800 | 1.0060          | 0.2034 | 0.5149 |
| 1.5419        | 8.5474 | 6000 | 1.0090          | 0.2045 | 0.5183 |
| 1.4761        | 8.8325 | 6200 | 1.0236          | 0.2026 | 0.5121 |
| 1.4934        | 9.1169 | 6400 | 1.0209          | 0.2030 | 0.5129 |
| 1.4594        | 9.4020 | 6600 | 1.0235          | 0.2030 | 0.5134 |
| 1.4198        | 9.6871 | 6800 | 1.0239          | 0.2027 | 0.5127 |
| 1.4721        | 9.9722 | 7000 | 1.0167          | 0.2033 | 0.5139 |
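
The CER and WER columns are the standard character and word error rates. A minimal sketch of computing them with the Hugging Face `evaluate` library (which wraps `jiwer`); the `predictions` and `references` lists below are placeholders, not data from this model:

```python
# Hedged sketch: compute WER/CER from transcript strings with `evaluate`.
# Requires `pip install evaluate jiwer`.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["hello world"]  # placeholder model outputs
references = ["hello word"]    # placeholder ground-truth transcripts

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```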

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0