AmpereComputing/gemma-3-1b-it-gguf
Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized build of llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp
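A minimal sketch of pulling and running the Ampere-optimized llama.cpp container. The image name comes from the Docker Hub link above; the tag, mount path, and model filename are illustrative assumptions, so check the image's documentation for the exact entrypoint and flags:

```shell
# Skip gracefully if Docker is not installed (this is a sketch, not a full setup script)
command -v docker >/dev/null 2>&1 || { echo "docker not found; install Docker first"; exit 0; }

# Pull the Ampere-optimized llama.cpp image (tag is an assumption; check Docker Hub for available tags)
docker pull amperecomputingai/llama.cpp:latest

# Run the container, mounting a local directory that holds the GGUF model
# (the model filename below is a hypothetical example of a Q8R16-quantized file)
docker run -it --rm \
  -v "$PWD/models:/models" \
  amperecomputingai/llama.cpp:latest \
  -m /models/gemma-3-1b-it.Q8R16.gguf -p "Hello"
```

The Q4_K_4 and Q8R16 formats are only decoded by this optimized build, so a stock llama.cpp binary will not load these GGUF files.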