Internal: Blas GEMM launch failed when running classifier for URLs
loginName1 opened this issue
System information:
- OS: Windows 11
- GPU: NVIDIA GeForce RTX 3080 Ti (12 GB)
- TensorFlow: tensorflow-gpu v1.14.0
- CUDA: v10.0 (I also have versions 12.4 and 9.0 installed, which don't have the required .dll file; all three are in my PATH)
- Python: 3.7 (for the purposes of protobuf)
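Since several CUDA versions share my PATH, a short script along these lines could show which cuBLAS DLLs are visible and from which directory. It is only a sketch: the `cublas64_*.dll` pattern is my assumption about how the files are named, and `find_dlls_on_path` is a hypothetical helper, not part of any library.

```python
import fnmatch
import os


def find_dlls_on_path(pattern="cublas64_*.dll", path=None):
    """Return (directory, filename) pairs for matching files, in PATH order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')
        # Skip empty entries, duplicates, and entries that are not directories.
        if not directory or directory in seen or not os.path.isdir(directory):
            continue
        seen.add(directory)
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # unreadable directory; ignore it
        for name in names:
            # Compare case-insensitively, as Windows file names are.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                hits.append((directory, name))
    return hits


if __name__ == "__main__":
    for directory, name in find_dlls_on_path():
        print(f"{name}  <-  {directory}")
```

Running it inside the same virtual environment that launches training should list each matching DLL next to the PATH entry it was found in.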
The error:
2024-04-22 15:35:14.641625: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
ERROR:tensorflow:Error recorded from training_loop: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(4096, 2), b.shape=(2, 768), m=4096, n=768, k=2
[[node bert/embeddings/MatMul (defined at D:\Faks\UM-Mag 23-25\Drugi semester\JT\google-bert\modeling.py:487) ]]
[[loss/Mean/_4861]]
(1) Internal: Blas GEMM launch failed : a.shape=(4096, 2), b.shape=(2, 768), m=4096, n=768, k=2
[[node bert/embeddings/MatMul (defined at D:\Faks\UM-Mag 23-25\Drugi semester\JT\google-bert\modeling.py:487) ]]
0 successful operations.
0 derived errors ignored.
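As a sanity check on the message itself, the shapes it reports are compatible for a matrix multiply, so this does not look like a shape mismatch. A minimal sketch of that check in plain Python (`gemm_output_shape` is a hypothetical helper written only for this illustration):

```python
def gemm_output_shape(a_shape, b_shape):
    """Return the (m, n) result shape of a @ b, or raise if inner dims differ."""
    m, k_a = a_shape
    k_b, n = b_shape
    if k_a != k_b:
        raise ValueError(f"inner dimensions differ: {k_a} vs {k_b}")
    return (m, n)


# Shapes copied from the error message: m=4096, n=768, k=2.
print(gemm_output_shape((4096, 2), (2, 768)))  # (4096, 768)
```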
What I've tried:
I checked nvidia-smi.exe to see whether anything else was running on the GPU while training, and got the following result:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.22 Driver Version: 552.22 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3080 Ti WDDM | 00000000:0A:00.0 Off | N/A |
| 0% 36C P8 24W / 350W | 1598MiB / 12288MiB | 3% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 9072 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 10652 C+G ...al\Discord\app-1.0.9042\Discord.exe N/A |
| 0 N/A N/A 10688 C+G ...ekyb3d8bbwe\PhoneExperienceHost.exe N/A |
| 0 N/A N/A 10820 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 0 N/A N/A 10844 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 0 N/A N/A 14572 C+G ...\cef\cef.win7x64\steamwebhelper.exe N/A |
| 0 N/A N/A 14744 C+G ...GeForce Experience\NVIDIA Share.exe N/A |
| 0 N/A N/A 14824 C+G ...1.0_x64__8wekyb3d8bbwe\Video.UI.exe N/A |
| 0 N/A N/A 15072 C+G ...t.LockApp_cw5n1h2txyewy\LockApp.exe N/A |
| 0 N/A N/A 17164 C+G ...CBS_cw5n1h2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 18100 C+G ...les\Microsoft OneDrive\OneDrive.exe N/A |
| 0 N/A N/A 18936 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 N/A N/A 19308 C+G ...air\Corsair iCUE5 Software\iCUE.exe N/A |
| 0 N/A N/A 19848 C+G ...crosoft\Edge\Application\msedge.exe N/A |
| 0 N/A N/A 23840 C+G ..._x64__kzf8qxf38zg5c\Skype\Skype.exe N/A |
| 0 N/A N/A 24040 C+G ...lf\0.248.120.19\OverwolfBrowser.exe N/A |
| 0 N/A N/A 25276 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 0 N/A N/A 25604 C+G ...ejd91yc\AdobeNotificationClient.exe N/A |
| 0 N/A N/A 25636 C+G ...509_x64__8wekyb3d8bbwe\ms-teams.exe N/A |
| 0 N/A N/A 25920 C+G ...ktop\EA Desktop\EACefSubProcess.exe N/A |
| 0 N/A N/A 25972 C+G ...\GOG Galaxy\GalaxyClient Helper.exe N/A |
| 0 N/A N/A 26180 C+G ...EA Desktop\EA Desktop\EADesktop.exe N/A |
| 0 N/A N/A 28536 C+G ...cks-services\BlueStacksServices.exe N/A |
| 0 N/A N/A 29332 C+G ...aam7r\AcrobatNotificationClient.exe N/A |
| 0 N/A N/A 31120 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 0 N/A N/A 31468 C+G ..._x64__kzf8qxf38zg5c\Skype\Skype.exe N/A |
| 0 N/A N/A 32340 C+G ...m Files\Mozilla Firefox\firefox.exe N/A |
| 0 N/A N/A 33104 C+G ...on\HEX\Creative Cloud UI Helper.exe N/A |
| 0 N/A N/A 33208 C+G ...on\123.0.2420.97\msedgewebview2.exe N/A |
| 0 N/A N/A 35396 C+G ...wekyb3d8bbwe\XboxGameBarWidgets.exe N/A |
| 0 N/A N/A 39264 C+G ...m Files\Mozilla Firefox\firefox.exe N/A |
+-----------------------------------------------------------------------------------------+
Then I searched for similar issues and found this. I followed that answer's instructions and added the lines to modeling.py after the imports, but I received the same error.
I didn't find any other possible solutions, and I'm unsure what I'm doing wrong. Did I add the memory growth lines to the wrong file, or did I go about solving the issue completely wrong? Any help is appreciated.
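For context, the memory growth lines usually suggested for TensorFlow 1.x look roughly like the sketch below. This is an assumption about what such answers contain, not a copy of the linked one, and `build_session_config` is a hypothetical helper name. It uses `tf.compat.v1.ConfigProto` so that the same lines should also run on newer TensorFlow versions.

```python
def build_session_config():
    """Build a session config that lets TensorFlow grow GPU memory on demand."""
    import tensorflow as tf  # assumes TensorFlow is installed (1.14 in this setup)

    config = tf.compat.v1.ConfigProto()
    # Allocate GPU memory as needed instead of reserving nearly all of it up front.
    config.gpu_options.allow_growth = True
    return config
```

Because run_classifier.py trains through an Estimator, a config created in modeling.py is probably never attached to the session that does the training. Passing the result as `session_config=` where the `RunConfig` is built in run_classifier.py may be the right place, though I have not verified that it resolves the error.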
I am running the Python 3.7 kernel in a virtual environment, and the data I am feeding the model is properly formatted. I am using the BERT base uncased model downloaded from this repository.