Could you give me a reason why you ignore the kv_a_proj_with_mqa layer when quantizing this model?
#10
by superahn - opened
Could you give me a reason why you ignore the kv_a_proj_with_mqa layer when quantizing this model?
Because some AWQ kernel implementations only support dimensions divisible by 128, while that layer has a dimension of 64.
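
To illustrate the constraint: AWQ-style group quantization splits each layer's input channels into groups of a fixed size (128 here), so a layer whose dimension is not a multiple of the group size cannot be packed and is skipped. The sketch below is illustrative only, under the assumption of a 128-wide group; the layer dimensions and the `is_quantizable` helper are hypothetical, not part of any AWQ library's API.

```python
GROUP_SIZE = 128  # assumed AWQ kernel group size, per the answer above

def is_quantizable(in_features: int, group_size: int = GROUP_SIZE) -> bool:
    """A layer can be group-quantized only if its dim splits into whole groups."""
    return in_features % group_size == 0

# Hypothetical layer dims for illustration; kv_a_proj_with_mqa's dim of 64
# comes from the answer above, the other value is made up.
layers = {
    "q_proj": 5120,            # 5120 % 128 == 0 -> can be quantized
    "kv_a_proj_with_mqa": 64,  # 64 < 128 -> must be skipped
}

skipped = [name for name, dim in layers.items() if not is_quantizable(dim)]
print(skipped)  # -> ['kv_a_proj_with_mqa']
```

In practice such layers are usually left in their original precision rather than quantized with a smaller group size.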