sayhan/gemma-7b-GGUF-quantized

Text Generation

sayhan committed on Feb 23, 2024

Commit

ceab941

·

verified ·

1 Parent(s): 65c2f88

Create README.md

Files changed (1)

README.md +53 -0

README.md ADDED

	@@ -0,0 +1,53 @@

+---
+base_model: google/gemma-7b
+language:
+- en
+pipeline_tag: text-generation
+license: other
+model_type: gemma
+library_name: transformers
+inference: false
+---
+![image/webp](https://cdn-uploads.huggingface.co/production/uploads/65aa2d4b356bf23b4a4da247/NQAvp6NRHlNILyWWFlrA7.webp)
+## Google Gemma 7B
+- **Model creator:** [Google](https://huggingface.co/google)
+- **Original model:** [gemma-7b](https://huggingface.co/google/gemma-7b)
+- [**Terms of use**](https://www.kaggle.com/models/google/gemma/license/consent)
+<!-- description start -->
+## Description
+This repo contains GGUF-format model files for [Google's Gemma 7B](https://huggingface.co/google/gemma-7b).
+## Original model
+- **Developed by:** [Google](https://huggingface.co/google)
+### Description
+Gemma is a family of lightweight, state-of-the-art open models from Google,
+built from the same research and technology used to create the Gemini models.
+They are text-to-text, decoder-only large language models, available in English,
+with open weights, pre-trained variants, and instruction-tuned variants. Gemma
+models are well-suited for a variety of text generation tasks, including
+question answering, summarization, and reasoning. Their relatively small size
+makes it possible to deploy them in environments with limited resources such as
+a laptop, desktop or your own cloud infrastructure, democratizing access to
+state-of-the-art AI models and helping foster innovation for everyone.
+## Quantization types
+| quantization method | bits | size    | description                            | recommended |
+|---------------------|------|---------|----------------------------------------|-------------|
+| Q2_K                | 2    | 3.09 GB | very small, very high quality loss     | ❌          |
+| Q3_K_S              | 3    | 3.68 GB | very small, high quality loss          | ❌          |
+| Q3_K_L              | 3    | 4.4 GB  | small, substantial quality loss        | ❌          |
+| Q4_0                | 4    | 4.81 GB | legacy; small, very high quality loss  | ❌          |
+| Q4_K_S              | 4    | 4.84 GB | medium, balanced quality               | ✅          |
+| Q4_K_M              | 4    | 5.13 GB | medium, balanced quality               | ✅          |
+| Q5_0                | 5    | 5.88 GB | legacy; medium, balanced quality       | ❌          |
+| Q5_K_S              | 5    | 5.88 GB | large, low quality loss                | ✅          |
+| Q5_K_M              | 5    | 6.04 GB | large, very low quality loss           | ✅          |
+| Q6_K                | 6    | 7.01 GB | very large, extremely low quality loss | ❌          |
+| Q8_0                | 8    | 9.08 GB | very large, extremely low quality loss | ❌          |
+| FP16                | 16   | 17.1 GB | enormous, negligible quality loss      | ❌          |
+## Usage
+You can use this model with the latest builds of **LM Studio** and **llama.cpp**.
+If you're new to the world of _large language models_, I recommend starting with **LM Studio**.
+<!-- description end -->
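
The quantization table in this README can be read as a lookup for "which file fits my machine". The sketch below is a minimal illustration of that idea, not part of the repository: it copies the sizes and the recommended flags from the table and returns the largest file that fits a memory budget. The function name `pick_quant` and the default `overhead_gb=1.0` (headroom for context and runtime buffers) are assumptions made for this example; real overhead depends on context length and the runtime used.

```python
# Sizes (GB) and recommended flags copied from the README's quantization table.
QUANTS = [
    ("Q2_K", 3.09, False),
    ("Q3_K_S", 3.68, False),
    ("Q3_K_L", 4.4, False),
    ("Q4_0", 4.81, False),
    ("Q4_K_S", 4.84, True),
    ("Q4_K_M", 5.13, True),
    ("Q5_0", 5.88, False),
    ("Q5_K_S", 5.88, True),
    ("Q5_K_M", 6.04, True),
    ("Q6_K", 7.01, False),
    ("Q8_0", 9.08, False),
    ("FP16", 17.1, False),
]


def pick_quant(budget_gb, overhead_gb=1.0, recommended_only=True):
    """Return the name of the largest listed file that fits, or None.

    budget_gb: free RAM or VRAM available for the model.
    overhead_gb: assumed headroom beyond the file size (hypothetical default).
    recommended_only: restrict the search to rows marked as recommended.
    """
    usable = budget_gb - overhead_gb
    candidates = [
        (name, size)
        for name, size, recommended in QUANTS
        if size <= usable and (recommended or not recommended_only)
    ]
    if not candidates:
        return None
    # On equal sizes, max() keeps the first matching row in table order.
    return max(candidates, key=lambda item: item[1])[0]


if __name__ == "__main__":
    for budget in (4.0, 6.0, 8.0, 16.0):
        print(f"{budget:>5} GB budget -> {pick_quant(budget)}")
```

Passing `recommended_only=False` widens the search to every row, which is useful when the budget is too small for any recommended file or large enough for Q6_K and above.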