@silvaxxx1
Addresses build errors from deprecated CUDA flags

Changes Proposed

  • Updated the CMake configuration to use GGML_CUDA in place of the deprecated LLAMA_CUBLAS flag
  • Fixed the paths for the conversion script (now convert-hf-to-gguf.py) and the quantize binary
  • Added error handling for model downloads
  • Updated the documentation with the new build requirements
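For reference, the updated flag changes the configure step roughly as follows (a minimal sketch; upstream llama.cpp removed the LLAMA_CUBLAS option in favor of GGML_CUDA, and exact paths may differ in this repo):

```shell
# Old (now fails): cmake -B build -DLLAMA_CUBLAS=ON
# New: enable CUDA via the GGML_CUDA flag.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```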

Testing Performed

  • Verified a clean build on Ubuntu 22.04 with CUDA 12.1
  • Tested the full quantization workflow with EvolCodeLlama-7b
  • Confirmed GPU acceleration is active via nvidia-smi monitoring
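The workflow exercised above looks roughly like this (a sketch, not the exact commands: the model path, output names, and quantization type are illustrative, and the quantize binary's name and location vary between llama.cpp versions):

```shell
# Convert the Hugging Face checkpoint to GGUF using the renamed script.
python convert-hf-to-gguf.py ./EvolCodeLlama-7b --outfile model-f16.gguf

# Quantize with the binary from the CMake build tree
# (newer llama.cpp builds place it under build/bin/).
./build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
```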

Notes for Reviewers

  • Requires CUDA toolkit 11.x-12.x
  • Tested with Python 3.10
  • Documented the new git-lfs dependency
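The download error handling can be sketched as a small retry wrapper (hypothetical helper, not the exact code in this PR; `download_with_retry` and the retry count are assumptions):

```shell
#!/usr/bin/env sh
# download_with_retry CMD [ARGS...] -- runs CMD up to 3 times,
# returning 1 if every attempt fails.
download_with_retry() {
  attempts=0
  max_attempts=3
  until "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$max_attempts" ]; then
      echo "download failed after $max_attempts attempts" >&2
      return 1
    fi
    echo "retrying ($attempts/$max_attempts)..." >&2
  done
}
```

Usage would be e.g. `download_with_retry git lfs pull`, so a transient network failure does not abort the whole setup.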
- Replace deprecated LLAMA_CUBLAS with GGML_CUDA
- Update conversion script to convert-hf-to-gguf.py
- Fix quantize binary path
- Add error handling for model downloads