Supporting quantized (GGUF) models for resource-constrained environments #17415
bhupesh-sf
started this conversation in
Ideas & Features
Replies: 0 comments
Hey,
Currently the repo only uses safetensors models, and I was wondering whether we could support quantized (GGUF) models for resource-constrained environments. I had to deploy it in a CPU-only environment with limited resources, and it uses a lot of memory for such an environment.