CzeGPT-2
CzeGPT-2 is a Czech version of OpenAI's GPT-2 language model with a language-modeling (LM) head on top. The model has the same architectural dimensions as GPT-2 small (12 layers, 12 attention heads, a 1024-token context window, and 768-dimensional embedding vectors), resulting in 124 M trainable parameters. It was trained on a 5 GB slice of the cleaned csTenTen17 dataset.
The model is a good building block for any downstream task that requires autoregressive text generation.
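The 124 M figure can be reproduced from the dimensions above. The sketch below is a back-of-the-envelope count that assumes the standard GPT-2 small layout (learned position embeddings, a 4x feed-forward expansion, biases on all linear layers, and an LM head that shares its weights with the token embedding) together with the 50257-entry vocabulary of the accompanying tokenizer. These layout details are assumptions carried over from GPT-2 small, not statements taken from the model card, and `gpt2_param_count` is a hypothetical helper name.

```python
def gpt2_param_count(n_layer=12, n_embd=768, n_ctx=1024, vocab_size=50257):
    """Count trainable parameters of a GPT-2 style decoder (assumed layout)."""
    token_emb = vocab_size * n_embd          # token embedding, shared with the LM head
    pos_emb = n_ctx * n_embd                 # learned position embedding
    layer_norm = 2 * n_embd                  # scale + bias
    # Attention: fused QKV projection plus output projection, each with bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4 * n_embd and project back, each with bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    block = 2 * layer_norm + attn + mlp      # two layer norms per block
    return token_emb + pos_emb + n_layer * block + layer_norm  # + final layer norm


total = gpt2_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f} M)")  # 124,439,808 parameters (~124 M)
```

The number of attention heads does not appear in the formula: the heads split the 768-dimensional projections without changing the total number of weights.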
Tokenizer
Along with the model, we provide the tokenizer (vocab and merges files) that was used during pre-training; its vocabulary size is 50257. It is a byte-level BPE tokenizer of the kind used in the original GPT-2 paper and was trained on the whole 5 GB training set.
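To illustrate why a byte-level BPE tokenizer suits Czech text, here is a toy sketch of its first steps: the text is split into UTF-8 bytes, so diacritics can never be out of vocabulary, and the most frequent adjacent pair is merged into a new symbol. This is a simplified illustration, not the released tokenizer; the helper names are hypothetical, and the vocabulary breakdown in the final comment is the one used by the original GPT-2 tokenizer, assumed (not confirmed by the model card) to carry over here.

```python
from collections import Counter


def to_byte_symbols(text):
    """Split text into single-byte symbols (the base alphabet of byte-level BPE)."""
    return [bytes([b]) for b in text.encode("utf-8")]


def most_frequent_pair(symbols):
    """Return the most frequent adjacent pair of symbols, or None if there is none."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(symbols, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


text = "žluťoučký kůň"
symbols = to_byte_symbols(text)            # 13 characters become 19 bytes
merged = merge_pair(symbols, most_frequent_pair(symbols))

# Merging never loses information: joining the symbols restores the text.
restored = b"".join(merged).decode("utf-8")

# Assumed breakdown (as in the original GPT-2 tokenizer):
# 256 byte symbols + 50000 learned merges + 1 end-of-text token.
assumed_vocab_size = 256 + 50000 + 1
```

A real tokenizer repeats the merge step over the whole training corpus until the target vocabulary size is reached and stores the resulting merge order in the merges file.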
Training results
The model's perplexity on a 250 MB random slice of the csTenTen17 dataset is 42.12. Unfortunately, this value is not directly comparable to any other model, since there are no competing Czech autoregressive models yet (and comparison with models for other languages is meaningless because of different tokenization and test data).
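For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token. The sketch below shows that definition in plain Python; it assumes natural-log token probabilities are already available and does not reproduce the authors' evaluation setup. The function name is hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Sanity check: a model that is always unsure between 4 equally likely
# tokens has a perplexity of (approximately) 4.
uniform_ppl = perplexity([math.log(0.25)] * 8)

# Conversely, the reported perplexity of 42.12 corresponds to an average
# loss of about 3.74 nats per token.
nats_per_token = math.log(42.12)
```

Because the average is taken per token, the value depends on how the tokenizer segments the text, which is why perplexities obtained with different tokenizers are not comparable.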
Running the predictions
The repository includes a simple Jupyter Notebook that can help with the first steps when using the model.
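As a starting point outside the notebook, the model can be loaded through the Hugging Face Transformers `pipeline` helper. This is a minimal sketch, not the notebook's code: the wrapper name `generate_text`, the example prompt, and the sampling settings are illustrative assumptions, and the first call downloads the model weights from the Hub.

```python
def generate_text(prompt, max_new_tokens=50, top_k=50, temperature=0.8):
    """Continue a Czech prompt with CzeGPT-2 (sampling settings are illustrative)."""
    # Imported lazily so that defining the function does not require
    # transformers to be installed or the model to be downloaded.
    from transformers import pipeline

    generator = pipeline("text-generation", model="MU-NLPC/CzeGPT-2")
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_k=top_k,
        temperature=temperature,
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example call (downloads the model on first use):
# print(generate_text("Praha je hlavní město"))
```

Creating the pipeline inside the function keeps the sketch short; code that generates many texts should build the pipeline once and reuse it.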
How to cite
Hájek A. and Horák A. CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task. IEEE Access, vol. 12, pp. 34570–34581, IEEE, 2024. https://doi.org/10.1109/ACCESS.2024.3371689
@article{hajek_horak2024,
author = "Adam Hájek and Aleš Horák",
title = "CzeGPT-2 -- Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task",
journal= "IEEE Access",
year = "2024",
volume = "12",
pages = "34570--34581",
doi = "10.1109/ACCESS.2024.3371689",
}