gptqmodel

Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.

Installation

In a virtualenv (see these instructions if you need to create one):

pip3 install gptqmodel
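
After installation, one way to confirm that the package is visible from the active virtualenv is to query its installed metadata. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_version` is illustrative and is not part of gptqmodel.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("gptqmodel")
if found is None:
    print("gptqmodel is not installed in this environment")
else:
    print(f"gptqmodel {found} is installed")
```

Because the check reads metadata only, it does not load any CPU or GPU backends, so it works as a quick sanity check before running anything heavier.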

Releases

Version  Released  Bullseye (Python 3.9)  Bookworm (Python 3.11)  Files
2.1.0 2025-03-13    
2.0.0 2025-03-03    
1.9.0 2025-02-12    
1.8.1 2025-02-08    
1.8.0 2025-02-07    
1.7.4 2025-01-26    
1.7.3 2025-01-21    
1.7.2 2025-01-19    
1.7.0 2025-01-17    
1.6.0 2025-01-06    
1.5.1 2025-01-01    
1.5.0 2024-12-24    
1.4.5 2024-12-19    
1.4.4 2024-12-17    
1.4.3 2024-12-17    
1.4.2 2024-12-16    
1.4.1 2024-12-13    
1.4.0 2024-12-10    
1.3.1 2024-11-29    
1.3.0 2024-11-27    
1.2.3 2024-11-25    
1.2.1 2024-11-11    
1.2.1.dev0 pre-release 2024-11-11    
1.2.0 2024-11-11    
1.1.0 2024-10-29    
1.0.9 2024-10-13    
1.0.8 2024-10-11    
1.0.7 2024-10-08    
1.0.6 2024-09-26    
1.0.5 2024-09-26    
1.0.4 2024-09-26    
1.0.3 2024-09-19    
1.0.2 2024-08-17    
1.0.1 2024-08-15    

Page last updated 2025-03-13 18:49:08 UTC