---
library_name: exllamav3
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507/blob/main/LICENSE
pipeline_tag: text-generation
base_model: Qwen/Qwen3-235B-A22B-Thinking-2507
base_model_relation: quantized
tags:
  - exl3
---

ExLlamaV3 quantizations of [Qwen/Qwen3-235B-A22B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507).

* [2.10 bpw h6](https://huggingface.co/MikeRoz/Qwen3-235B-A22B-Thinking-2507-exl3/tree/2.10bpw_H6) — 59.287 GiB
* [2.80 bpw h6](https://huggingface.co/MikeRoz/Qwen3-235B-A22B-Thinking-2507-exl3/tree/2.80bpw_H6) — 78.295 GiB
* [3.60 bpw h6](https://huggingface.co/MikeRoz/Qwen3-235B-A22B-Thinking-2507-exl3/tree/3.60bpw_H6) — 100.116 GiB
* [4.25 bpw h6](https://huggingface.co/MikeRoz/Qwen3-235B-A22B-Thinking-2507-exl3/tree/4.25bpw_H6) — 117.803 GiB

Approximate VRAM requirements:

* The 2.10 bpw quant fits on three 24 GB cards with 45k tokens of context.
* The 2.80 bpw quant fits on four 24 GB cards with 57k tokens of context.
* The 3.60 bpw quant fits on five 24 GB cards with 57k tokens of context.
* The 4.25 bpw quant fits on six 24 GB cards with 73k tokens of context.
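
As a rough sanity check of the sizes above, the weight footprint of a quant is approximately `parameter_count × bits_per_weight / 8` bytes. A minimal sketch, assuming a total parameter count of ~235e9 (the exact count for this model differs slightly, and the h6 output head plus tensor overhead add a little on top of the estimate):

```python
GIB = 1024 ** 3  # bytes per GiB


def quant_size_gib(n_params: float, bpw: float) -> float:
    """Estimate weight storage in GiB for a model quantized to bpw bits per weight."""
    return n_params * bpw / 8 / GIB


# Listed sizes from this repo vs. the naive estimate (assumed ~235e9 params).
for bpw, listed in [(2.10, 59.287), (2.80, 78.295), (3.60, 100.116), (4.25, 117.803)]:
    est = quant_size_gib(235e9, bpw)
    print(f"{bpw:.2f} bpw: ~{est:.1f} GiB estimated vs {listed} GiB listed")
```

The estimates land within a few GiB of the listed sizes; the remainder is the higher-precision head layer and per-tensor metadata. Note that the per-card fit figures above also have to leave room for the KV cache, which scales with context length.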