How to use Andrija/SRoBERTa with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="Andrija/SRoBERTa")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("Andrija/SRoBERTa")
model = AutoModelForMaskedLM.from_pretrained("Andrija/SRoBERTa")
```
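The snippets above only construct the pipeline or load the model. As a minimal sketch of how the fill-mask pipeline might then be queried, the helper below substitutes the tokenizer's own mask token into a sentence and returns the top candidates. The names `fill_mask_top_k` and `main`, the `{mask}` placeholder convention, and the example sentence are illustrative assumptions and do not come from the model card.

```python
from typing import Any, List, Tuple


def fill_mask_top_k(pipe: Any, template: str, k: int = 5) -> List[Tuple[str, float]]:
    """Fill the single `{mask}` slot in `template` and return (token, score) pairs.

    `pipe` is assumed to be a Transformers fill-mask pipeline (hypothetical helper).
    """
    # Ask the tokenizer for its mask token rather than hard-coding one.
    text = template.format(mask=pipe.tokenizer.mask_token)
    predictions = pipe(text, top_k=k)
    # For a single mask, the pipeline returns a list of dicts that
    # include "token_str" and "score".
    return [(p["token_str"].strip(), float(p["score"])) for p in predictions]


def main() -> None:
    # Imported here so the helper above can be reused without loading the model.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="Andrija/SRoBERTa")
    # Hypothetical example sentence, not taken from the model card.
    for token, score in fill_mask_top_k(pipe, "Zagreb je glavni grad {mask}."):
        print(f"{token}\t{score:.3f}")
```

Calling `main()` downloads the model from the Hub on first use. Reading the mask token from `pipe.tokenizer` avoids assuming which mask string this particular tokenizer was trained with.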
Update README.md
README.md
CHANGED
```diff
@@ -13,4 +13,21 @@ license: apache-2.0
 ---
 # Transformer language model for Croatian and Serbian
 Trained on 0.7GB dataset Croatian and Serbian language for one epoch.
-Dataset from Leipzig Corpora.
+Dataset from Leipzig Corpora.
+
+# Information of dataset
+| Model | #params | Arch. | Training data |
+
+|--------------------------------|--------------------------------|-------|-----------------------------------|
+
+| `Andrija/SRoBERTa` | 120M | First | Leipzig Corpus (0.7 GB of text) |
+
+
+# How to use in code
+```python
+from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+tokenizer = AutoTokenizer.from_pretrained("Andrija/SRoBERTa")
+
+model = AutoModelForMaskedLM.from_pretrained("Andrija/SRoBERTa")
+```
```