Ashkboos

Saleh

https://sashkboos.github.io

AI & ML interests

ML/DL/HPC

authored 5 papers about 1 year ago

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Paper • 2210.17323 • Published Oct 31, 2022 • 8

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

Paper • 2306.03078 • Published Jun 5, 2023 • 3

Towards End-to-end 4-Bit Inference on Generative Large Language Models

Paper • 2310.09259 • Published Oct 13, 2023 • 1

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

Paper • 2404.00456 • Published Mar 30, 2024 • 4

STen: Productive and Efficient Sparsity in PyTorch

Paper • 2304.07613 • Published Apr 15, 2023

authored a paper over 1 year ago

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 75