Regarding supported image token budgets
#20
by J22 - opened
The supported token budgets are: 70, 140, 280, 560, and 1120.
Theoretically (mathematically), the reference model implementation in transformers could support any image size, from a minimum of 3x3 patches up to 10240x10240 patches (10240 being position_embedding_size), which works out to ~11648569 LLM tokens.
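For reference, the ~11648569 figure can be reproduced with a few lines of arithmetic. This is only a sketch under an assumption: that each LLM token comes from pooling one 3x3 block of patches, with incomplete blocks dropped. The names `POOL` and `soft_tokens` are made up for illustration and are not taken from the transformers code.

```python
# Assumption: one LLM token per 3x3 block of patches (partial blocks dropped).
POOL = 3  # hypothetical name for the pooling kernel size
POSITION_EMBEDDING_SIZE = 10240  # value quoted in the post


def soft_tokens(patches_h: int, patches_w: int, pool: int = POOL) -> int:
    """Number of tokens left after pooling a patches_h x patches_w grid."""
    return (patches_h // pool) * (patches_w // pool)


# Smallest grid that still yields a token
print(soft_tokens(3, 3))  # 1

# Largest grid the position embedding table could index
print(soft_tokens(POSITION_EMBEDDING_SIZE, POSITION_EMBEDDING_SIZE))  # 11648569

# Patch counts the listed budgets would map to under the same assumption
for budget in (70, 140, 280, 560, 1120):
    print(budget, budget * POOL * POOL)
```

Under that assumption, the largest listed budget (1120) corresponds to 10080 patches in total, far below what the position embedding table could address.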
So, my questions are:
- Are other token budgets (such as 100) truly not supported?
- Is `position_embedding_table` fully trained?