the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders

Home Page: https://huggingface.co/spaces/mike-ravkine/can-ai-code-results

2bit llama70b with quip-sharp

the-crypt-keeper opened this issue

  • Requires CUDA 12.
  • Tried nvcr.io/nvidia/pytorch:23.06-py3 as the base image, but the transformer-engine package inside that image is broken and crashes on load.
  • Running pip install git+https://github.com/NVIDIA/TransformerEngine.git@main fixes the crash.
  • The model then loads, but the generate call never returns.
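
For anyone retrying this, one way to keep the hung generate call from blocking a whole evaluation run is to wrap it in a timeout. The sketch below is only an illustration and not something from this repo: `call_with_timeout` is a hypothetical helper, and `model.generate(**inputs)` in the usage comment stands in for whatever call actually hangs. It also assumes the blocked call releases the GIL, which may not hold for a stuck CUDA kernel.

```python
import threading

def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a daemon thread and wait at most timeout_s seconds.

    Returns (True, result) if fn finished in time, (False, None) if it is
    still running. Hypothetical helper, for illustration only.
    """
    box = {}

    def worker():
        try:
            box["result"] = fn(*args, **kwargs)
        except Exception as exc:  # re-raised in the caller below
            box["error"] = exc

    # daemon=True so a hung worker does not keep the process alive at exit
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        return False, None
    if "error" in box:
        raise box["error"]
    return True, box["result"]

# Hypothetical usage; model and inputs are assumed to exist already:
# finished, output = call_with_timeout(lambda: model.generate(**inputs), 120)
# if not finished:
#     print("generate did not return within 120 s")
```

If the call holds the GIL while stuck, the join above will not time out either. In that case the standard library's `faulthandler.dump_traceback_later(120)` may be more useful, since its watchdog thread does not depend on the GIL and will print where each Python thread is blocked.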

Closing out all old 2-bit quants.