Cuda: unknow error cuda_get_deviceinfo on line 535 #245

Open
berezinevgeniy opened this issue Mar 8, 2019 · 13 comments

@berezinevgeniy

berezinevgeniy commented Mar 8, 2019

I get an error message on startup with GeForce 610-620 and 720 GPUs and xmrig-nvidia 2.13+, using drivers 376.71-391.35 (the latest available for me).

  • ABOUT XMRig-NVIDIA/2.14.0 MSVC/2015
  • LIBS libuv/1.24.1 CUDA/9.0 OpenSSL/1.1.1a microhttpd/0.9.61
  • CPU Intel(R) Core(TM) i3-3110M CPU @ 2.40GHz x64 -AES
  • GPU #0 PCI:0000:01:00 GeForce 610M @ 950/900 MHz 32x10 8x25 arch:21 SMX :1
  • ALGO cryptonight, donate=5%
  • POOL #1 XXXXXXX:8888 variant auto
  • COMMANDS hashrate, health, pause, resume

GPU 0: unknown error
cuda_get_deviceinfo line 535
[2019-03-09 00:09:29] Setup failed for GPU 0. Exitting.

The xmrig CUDA 9 build has the same problem. 2.8.3+ works fine.

@xmrig
Owner

xmrig commented Mar 8, 2019

You should use the CUDA 8.0 version; the Fermi architecture (compute capability 2.1) is not supported by CUDA 9.0.
Thank you.
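
As a rough illustration of the compatibility rule stated here, the following Python sketch reads the `arch:` field from the miner's GPU banner line and checks it against the minimum compute capability of a CUDA toolkit. The table covers only the two toolkit versions discussed in this thread, the function names are made up for this example, and the idea that `arch:NN` encodes the major and minor capability as two digits is an assumption inferred from the banner output, not taken from the xmrig source.

```python
import re

# Minimum compute capability (major, minor) each CUDA toolkit can target.
# Only the two versions mentioned in this thread are listed.
MIN_COMPUTE_CAPABILITY = {
    "8.0": (2, 0),  # Fermi is still supported
    "9.0": (3, 0),  # Fermi support was removed
}


def parse_arch(gpu_line):
    """Extract (major, minor) from the 'arch:NN' field of a GPU banner line.

    Assumes the field is two digits: major then minor (e.g. 'arch:21').
    """
    match = re.search(r"arch:(\d)(\d)", gpu_line)
    if match is None:
        raise ValueError("no arch field found in banner line")
    return int(match.group(1)), int(match.group(2))


def is_supported(gpu_line, cuda_version):
    """Return True if the GPU in the banner line can run this CUDA build."""
    return parse_arch(gpu_line) >= MIN_COMPUTE_CAPABILITY[cuda_version]
```

For the banner line posted above (`arch:21`), `is_supported(line, "9.0")` is false and `is_supported(line, "8.0")` is true, which matches the advice to use the CUDA 8.0 build.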

@berezinevgeniy
Author

berezinevgeniy commented Mar 8, 2019

I have the same error on CUDA 8 :-(

  • ABOUT XMRig-NVIDIA/2.14.0 MSVC/2015
  • LIBS libuv/1.24.1 CUDA/8.0 OpenSSL/1.1.1a microhttpd/0.9.61
  • CPU Intel(R) Core(TM) i3-3110M CPU @ 2.40GHz x64 -AES
  • GPU #0 PCI:0000:01:00 GeForce 610M @ 950/900 MHz 32x10 8x25 arch:21 SMX :1
  • ALGO cryptonight, donate=5%
  • POOL #1 XXXXXXX:8888 variant auto
  • COMMANDS hashrate, health, pause, resume

GPU 0: unknown error
cuda_get_deviceinfo line 535
[2019-03-09 01:22:30] Setup failed for GPU 0. Exitting.

With 2.8.3 on CUDA 8 I have no problems.

@berezinevgeniy
Author

GF 740M works fine with 2.14. Only the GF 6(7,8)10-20M series fails with the new xmrig.
How can I find the error code from CUDA? I cannot compile a new binary with this part of the code (cuda_get_deviceinfo) commented out. Can you help me?

@Lonelysoul-HayashiNoMirai

I have the same problem too :((. Plz help.....

@xmrig
Owner

xmrig commented Mar 10, 2019

@berezinevgeniy Can you provide the config too? Currently I have no idea what causes this issue: a plain call to cudaGetDeviceProperties fails with an unknown error, even though the previous call succeeded, since the GPU name and other information were detected correctly.
Thank you.

@berezinevgeniy
Author

berezinevgeniy commented Mar 10, 2019

It seems to be an unsupported CUDA problem. The latest driver available for me is 391.35, and all of my problem video cards are Fermi, which Nvidia no longer supports. All my other video cards run 417+ drivers and have no problems (after xmrig-nvidia 2.14.1).

P.S. xmr-stak has a similar error on startup with Fermi GeForces after the 2.10.0 update and, like xmrig, works with earlier versions.
[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend_cuda10_0'
WARNING: NVIDIA Insufficient driver!
WARNING: NVIDIA no device found
[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend_cuda9_2'
WARNING: NVIDIA cannot load backend library: xmrstak_cuda_backend_cuda9_2.dll
WARNING: NVIDIA Insufficient driver!
WARNING: NVIDIA no device found
[2019-03-10 18:36:44] : NVIDIA: try to load library 'xmrstak_cuda_backend'
NVIDIA: found 1 potential device's
[2019-03-10 18:56:01] : Starting NVIDIA GPU thread 0, affinity: 36761328.
WARNING: Invalid device ID '36761376'!
[2019-03-10 18:56:01] : Setup failed for GPU 0. Exiting.

It seems something is wrong with the device ID.

I think we need to get debug info for this error. We have no full description of what happened in cudaGetDeviceProperties. Usually an unknown error comes with a detailed description.
If you like, I can give you TeamViewer access to a PC with a Fermi Nvidia card for testing, or you can compile a test miner that does not call cudaGetDeviceProperties twice (reusing the result of the previous call, since that one works as you say) :-)

Config.json
{
    "algo": "cryptonight",
    "api": {
        "port": 0,
        "access-token": null,
        "id": null,
        "worker-id": null,
        "ipv6": false,
        "restricted": true
    },
    "background": false,
    "colors": true,
    "cuda-bfactor": 6,
    "cuda-bsleep": 25,
    "cuda-max-threads": 64,
    "donate-level": 5,
    "log-file": "c:\\xmrig\\log.txt",
    "pools": [
        {
            "url": "xxx",
            "user": "xxx",
            "pass": "x",
            "rig-id": null,
            "nicehash": false,
            "keepalive": false,
            "variant": -1,
            "tls": false,
            "tls-fingerprint": null
        }
    ],
    "print-time": 60,
    "retries": 5,
    "retry-pause": 5,
    "threads": [
        {
            "index": 0,
            "threads": 64,
            "blocks": 4,
            "bfactor": 8,
            "bsleep": 25,
            "sync_mode": 3,
            "affine_to_cpu": false
        }
    ],
    "user-agent": null,
    "syslog": false,
    "watch": false
}

@berezinevgeniy
Author

berezinevgeniy commented Mar 10, 2019

This build of xmr-stak (the CUDA 8 version download) works with Fermi. This is better than nothing, but I believe you can help with this. xmrig is faster, simpler, and more stable.

@Spudz76
Contributor

Spudz76 commented Mar 18, 2019

I modified the failure exit and also meta-miner so it will relaunch until it works. If it doesn't work, use a larger hammer...

Sometimes it takes something like 10 tries per success. But it works OK for now. Overclocking or not makes no difference to the init crash. I did not test underclocking though. When it does run there are no invalids or kernel crashes, so I'm pretty sure the clocking is fine.

I can also confirm that xmr-stak somehow never does this at all (even though its init code is basically identical?)
Only real difference is they load their backend as a DLL, while here it's linked static into the main exe. Maybe chain-loading DLL to DLL just works better than the static launch for some reason?
I never quite understood why xmr-stak refuses to static link (it "should" work "identical") but a weird unexplained side effect like this might be why? Also it's more of a CPU miner with GPU plugins so having a DLL plugin makes sense there (for runtime full disable of the GPU backends) but it seems like there are more reasons than just that.

>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5108MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection

GPU 0: unknown error
cuda_get_deviceinfo line 536
[2019-03-18 12:01:30] Setup failed for GPU 0. Exiting.
!!! Miner socket error
!!! Pool (gulf.moneroocean.stream:ssl443) <-> miner link was broken due to miner socket error
!!! Miner './xmrig-nvidia80 --config=config-r.json' exited with nonzero code 1
>>> Restarting './xmrig-nvidia80 --config=config-r.json' miner that was closed unexpectedly
>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5116MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection

GPU 0: unknown error
cuda_get_deviceinfo line 536
[2019-03-18 12:01:32] Setup failed for GPU 0. Exiting.
!!! Miner socket error
!!! Pool (gulf.moneroocean.stream:ssl443) <-> miner link was broken due to miner socket error
!!! Miner './xmrig-nvidia80 --config=config-r.json' exited with nonzero code 1
>>> Restarting './xmrig-nvidia80 --config=config-r.json' miner that was closed unexpectedly
>>> Starting miner: ./xmrig-nvidia80 --config=config-r.json
 * ABOUT        XMRig-NVIDIA/2.14.2-dev MSVC/2015
 * LIBS         libuv/1.23.0 CUDA/8.0 OpenSSL/1.1.1 microhttpd/0.9.59
 * CPU                 Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz x64 AES
 * GPU #0       PCI:0000:01:00 NVS 5200M @ 1390/1976 MHz 10x40 6x25 arch:21 SMX:2 MEM:0/5119MiB
 * ALGO         cryptonight, donate=0%
 * POOL #1      127.0.0.1:3334 variant=r
 * API BIND     [::]:10081
 * COMMANDS     'h' hashrate, 'e' health, 'p' pause, 'r' resume
>>> Miner server on 127.0.0.1:3334 port connected from 127.0.0.1
>>> Pool (gulf.moneroocean.stream:ssl443) <-> miner link was established due to new miner connection
[2019-03-18 12:01:34] use pool 127.0.0.1:3334  127.0.0.1
[2019-03-18 12:01:34] new job from 127.0.0.1:3334 diff 845 algo cn/r height 1793551
[2019-03-18 12:02:01] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:01]  * GPU #0: 81C FAN 0%
[2019-03-18 12:02:03] accepted (1/0) diff 845 (63 ms)
[2019-03-18 12:02:17] accepted (2/0) diff 845 (69 ms)
[2019-03-18 12:02:25] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:25]  * GPU #0: 82C FAN 0%
[2019-03-18 12:02:49] speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2019-03-18 12:02:49]  * GPU #0: 83C FAN 0%
[2019-03-18 12:03:13] speed 10s/60s/15m n/a 29.3 n/a H/s max n/a H/s
[2019-03-18 12:03:13]  * GPU #0: 83C FAN 0%
[2019-03-18 12:03:25] accepted (3/0) diff 845 (61 ms)
[2019-03-18 12:03:37] speed 10s/60s/15m n/a 29.4 n/a H/s max n/a H/s
[2019-03-18 12:03:37]  * GPU #0: 83C FAN 0%
[2019-03-18 12:04:01] speed 10s/60s/15m n/a 29.6 n/a H/s max n/a H/s
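
The relaunch-until-it-works approach described in this comment, and visible in the log above, can be sketched in Python roughly as follows. This is not the actual meta-miner code: the function name, the attempt limit, and the grace period are assumptions made for illustration. The sketch treats any exit within the grace period as a failed init and considers the miner started once the process survives that long.

```python
import subprocess


def launch_until_stable(command, max_attempts=20, grace_seconds=5.0):
    """Relaunch `command` until it survives the startup grace period.

    Returns (process, attempts_used). `process` is the still-running
    subprocess.Popen handle, or None if every attempt exited early.
    """
    for attempt in range(1, max_attempts + 1):
        process = subprocess.Popen(command)
        try:
            # If the miner dies during init, wait() returns its exit code.
            exit_code = process.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: treat init as successful.
            return process, attempt
        print(f"attempt {attempt}: exited with code {exit_code}, restarting")
    return None, max_attempts
```

A call such as `launch_until_stable(["./xmrig-nvidia80", "--config=config-r.json"])` would mirror the restart sequence in the log. Capping the number of attempts keeps a permanently broken setup from looping forever.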

@Spudz76
Contributor

Spudz76 commented Mar 18, 2019

Also, this is a Dell laptop, hence no fan reporting (NVidiaInspector just greys that section out as not available).
So that is "normal". It might be nice for it to disappear when unsupported, as the power usage does.

@Spudz76
Contributor

Spudz76 commented Mar 18, 2019

Also, cn-heavy only works at around 4x4, which is low occupancy and slow (probably about 25% of ideal).
Anything larger hits memory allocation failures at kernel init (once the init error has been brute-forced).

Most algos work nicely once hand-tuned. Fermi autotuning is not even close on most algos, though.
They seem to like threads set to 5 times the SMX count (10 here), with blocks then adjusted until memory allocation doesn't fail.
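
One way to read this hand-tuning rule is the small Python sketch below: fix the thread count at 5 times the SMX count, then step the block count down until allocation succeeds. The function name, the `max_blocks` starting point, and the `allocation_ok` callback are all hypothetical; in practice the "callback" is simply starting the miner and seeing whether kernel init fails.

```python
def tune_fermi_launch(smx_count, allocation_ok, max_blocks=64, threads_per_smx=5):
    """Pick (threads, blocks) following the hand-tuning rule of thumb.

    `allocation_ok(threads, blocks)` is a caller-supplied check that returns
    True when the miner can allocate memory for that launch configuration.
    Returns None if no block count works.
    """
    threads = threads_per_smx * smx_count
    # Start high and lower the block count until allocation succeeds.
    for blocks in range(max_blocks, 0, -1):
        if allocation_ok(threads, blocks):
            return threads, blocks
    return None
```

Searching downward from a high block count returns the largest configuration that still fits, which is the goal when trying to maximize occupancy on a card with little memory.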

@Spudz76
Contributor

Spudz76 commented Mar 18, 2019

(The MEM: part of the detection line is a hack I'm working on; it obviously doesn't work yet.)

@Spudz76
Contributor

Spudz76 commented Mar 20, 2019

PR #255 fixes this cuda_get_deviceinfo init-crash

although nobody really knows why

Comes with the memory reporting too, and support for -DCUDA_ARCH=21 by itself (2% faster on mine vs "20" code)

@Spudz76
Contributor

Spudz76 commented Mar 20, 2019

@berezinevgeniy don't forget to come back sometime and check this thread
