mistralai / mistral-inference

Official inference library for Mistral models

Home Page: https://mistral.ai/

CUDA EXTENSION NOT INSTALLED nvcr.io/nvidia/pytorch:22.12-py3

skr3178 opened this issue

Also reported at AutoGPTQ/AutoGPTQ#598.

!nvidia-smi

Sun Mar 17 12:04:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:01:00.0 On | N/A |
| 0% 47C P8 24W / 170W | 418MiB / 12288MiB | 14% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
!nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0

import torch

print(torch.__version__)

2.1.0a0+fe05266

import torch 
print(torch.version.cuda)

12.1
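For completeness, a small self-contained check using only standard torch APIs can confirm that this PyTorch build both reports its compile-time CUDA version and can actually reach the RTX 3060 at runtime:

import torch

# Versions reported inside the NGC container (torch 2.1.0a0+fe05266, built for CUDA 12.1)
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)

# Runtime visibility of the GPU
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))

If this prints the expected device, the "CUDA extension not installed" message is not about torch itself but about auto-gptq's own compiled kernels (probed further below).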

!pip list

Package Version


absl-py 1.4.0
accelerate 0.28.0
aiohttp 3.8.4
aiosignal 1.3.1
apex 0.1
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.2.1
astunparse 1.6.3
async-timeout 4.0.2
attrs 22.2.0
audioread 3.0.0
auto-gptq 0.6.0
backcall 0.2.0
beautifulsoup4 4.12.2
bitsandbytes 0.43.0
bleach 6.0.0
blis 0.7.9
cachetools 5.3.0
catalogue 2.0.8
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 3.1.0
click 8.1.3
cloudpickle 2.2.1
cmake 3.24.1.1
coloredlogs 15.0.1
comm 0.1.3
confection 0.0.4
contourpy 1.0.7
cubinlinker 0.2.2+2.g5f51201
cuda-python 12.1.0rc5+1.g808384c
cudf 23.2.0
cugraph 23.2.0
cugraph-dgl 23.2.0
cugraph-service-client 23.2.0
cugraph-service-server 23.2.0
cuml 23.2.0
cupy-cuda12x 12.0.0b3
cycler 0.11.0
cymem 2.0.7
Cython 0.29.34
dask 2023.1.1
dask-cuda 23.2.0
dask-cudf 23.2.0
datasets 2.18.0
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.8
distributed 2023.1.1
exceptiongroup 1.1.1
execnet 1.9.0
executing 1.2.0
expecttest 0.1.3
fastjsonschema 2.16.3
fastrlock 0.8.1
filelock 3.11.0
flash-attn 0.2.8.dev0
fonttools 4.39.3
frozenlist 1.3.3
fsspec 2024.2.0
gast 0.4.0
gekko 1.0.7
google-auth 2.17.3
google-auth-oauthlib 0.4.6
graphsurgeon 0.4.6
grpcio 1.53.0
HeapDict 1.0.1
huggingface-hub 0.21.4
humanfriendly 10.0
hypothesis 5.35.1
idna 3.4
importlib-metadata 6.3.0
importlib-resources 5.12.0
iniconfig 2.0.0
intel-openmp 2021.4.0
ipykernel 6.22.0
ipython 8.12.0
ipython-genutils 0.2.0
ipywidgets 8.1.2
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
json5 0.9.11
jsonschema 4.17.3
jupyter_client 8.2.0
jupyter_core 5.3.0
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab-pygments 0.2.2
jupyterlab-server 1.2.0
jupyterlab_widgets 3.0.10
jupytext 1.14.5
kiwisolver 1.4.4
langcodes 3.3.0
librosa 0.9.2
lit 16.0.1
llvmlite 0.39.1
locket 1.0.0
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.5
mdurl 0.1.2
mistune 2.0.5
mkl 2021.1.1
mkl-devel 2021.1.1
mkl-include 2021.1.1
mock 5.0.1
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multiprocess 0.70.16
murmurhash 1.0.9
nbclient 0.7.3
nbconvert 7.3.1
nbformat 5.8.0
nest-asyncio 1.5.6
networkx 2.6.3
notebook 6.4.10
numba 0.56.4+1.g536eedd6e
numpy 1.22.2
nvidia-dali-cuda110 1.24.0
nvidia-pyindex 1.0.9
nvtx 0.2.5
oauthlib 3.2.2
onnx 1.13.1
opencv 4.6.0
optimum 1.17.1
packaging 23.0
pandas 1.5.2
pandocfilters 1.5.0
parso 0.8.3
partd 1.3.0
pathy 0.10.1
peft 0.9.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.2.0
pip 21.2.4
pkgutil_resolve_name 1.3.10
platformdirs 3.2.0
pluggy 1.0.0
ply 3.11
polygraphy 0.46.2
pooch 1.7.0
preshed 3.0.8
prettytable 3.7.0
prometheus-client 0.16.0
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.4
ptxcompiler 0.7.0+27.gb446f00
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 15.0.1
pyarrow-hotfix 0.6
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.10.4
pycocotools 2.0+nv0.7.1
pycparser 2.21
pydantic 1.10.7
Pygments 2.15.0
pylibcugraph 23.2.0
pylibcugraphops 23.2.0
pylibraft 23.2.0
pynvml 11.4.1
pyparsing 3.0.9
pyrsistent 0.19.3
pytest 7.3.1
pytest-rerunfailures 11.1.2
pytest-shard 0.1.2
pytest-xdist 3.2.1
python-dateutil 2.8.2
python-hostlist 1.23.0
pytorch-quantization 2.1.2
pytz 2023.3
PyYAML 6.0
pyzmq 25.0.2
raft-dask 23.2.0
regex 2023.3.23
requests 2.28.2
requests-oauthlib 1.3.1
resampy 0.4.2
rmm 23.2.0
rouge 1.0.1
rsa 4.9
safetensors 0.4.2
scikit-learn 1.2.0
scipy 1.10.1
seaborn 0.12.2
Send2Trash 1.8.0
sentencepiece 0.2.0
setuptools 65.5.1
six 1.16.0
smart-open 6.3.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4
spacy 3.5.2
spacy-legacy 3.0.12
spacy-loggers 1.0.4
sphinx-glpi-theme 0.3
srsly 2.4.6
stack-data 0.6.2
strings-udf 23.2.0
sympy 1.11.1
tbb 2021.9.0
tblib 1.7.0
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorrt 8.6.1
terminado 0.17.1
thinc 8.1.9
threadpoolctl 3.1.0
thriftpy2 0.4.16
tinycss2 1.2.1
tokenizers 0.15.2
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.1.0a0+fe05266
torch-tensorrt 1.4.0.dev0
torchtext 0.13.0a0+fae8e8c
torchvision 0.15.0a0
tornado 6.2
tqdm 4.65.0
traitlets 5.9.0
transformer-engine 0.7.0
transformers 4.38.2
treelite 3.1.0
treelite-runtime 3.1.0
triton 2.0.0
typer 0.7.0
types-dataclasses 0.6.6
typing_extensions 4.5.0
ucx-py 0.30.0
uff 0.6.9
urllib3 1.26.15
wasabi 1.1.1
wcwidth 0.2.6
webencodings 0.5.1
Werkzeug 2.2.3
wheel 0.40.0
widgetsnbextension 4.0.10
xdoctest 1.0.2
xgboost 1.7.1
xxhash 3.4.1
yarl 1.8.2
zict 2.2.0
zipp 3.15.0
WARNING: You are using pip version 21.2.4; however, version 24.0 is available.
You should consider upgrading via the '/usr/bin/python -m pip install --upgrade pip' command.
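The "CUDA extension not installed." warning at the bottom is emitted by auto-gptq when it fails to import its compiled kernel modules. As a hedged probe (the module names below are an assumption based on auto-gptq 0.6.x sources and may differ between releases), one can check whether any of those extension modules are importable in this environment:

import importlib.util

# Assumed extension module names from auto-gptq 0.6.x; treat this as a guess,
# not an authoritative list.
candidates = ("autogptq_cuda_64", "autogptq_cuda_256", "exllama_kernels", "exllamav2_kernels")
for name in candidates:
    found = importlib.util.find_spec(name) is not None
    print(f"{name}: {'found' if found else 'NOT found'}")

If none are found, the installed auto-gptq 0.6.0 wheel shipped without usable compiled kernels for this interpreter, which would be consistent with the warning: the container's torch is a pre-release build (2.1.0a0+fe05266), so prebuilt auto-gptq wheels compiled against a release torch/CUDA ABI may not load here.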

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import prepare_model_for_kbit_training
from peft import LoraConfig, get_peft_model
from datasets import load_dataset
import transformers

model_name = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="auto", # automatically figures out how to best use CPU + GPU for loading model
                                             trust_remote_code=False, # prevents running custom model files on your machine
                                             revision="main") # which version of model to use in repo

CUDA extension not installed.
CUDA extension not installed.
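One possible workaround, offered only as a sketch and not verified inside this container: uninstall the prebuilt wheel and rebuild auto-gptq from source so its CUDA extension compiles against the container's torch 2.1.0a0 / CUDA 12.1 toolchain. The commands below assume the AutoGPTQ GitHub repository builds cleanly in this image:

# Sketch only: rebuild auto-gptq against the container's own torch install.
!pip uninstall -y auto-gptq
!pip install -v --no-build-isolation git+https://github.com/AutoGPTQ/AutoGPTQ.git

Note that the warning itself is typically not fatal; without the compiled extension auto-gptq falls back to a much slower pure-PyTorch path, so inference still runs, just slowly.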