Why is training so slow? #4

Open
kingdeewang opened this issue Apr 2, 2018 · 8 comments

Comments

@kingdeewang

I ran `py train.py` in the chat directory, and it took several days to finish a single epoch. What is the cause?

@qhduan
Owner

qhduan commented Apr 2, 2018

Training is extremely slow if you are not using a GPU or if CUDA is not configured correctly.
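
One way to check the first possibility is to ask TensorFlow which devices it can see. This is a minimal sketch, assuming TensorFlow is installed in the same environment that runs `train.py`. `visible_gpu_names` is a hypothetical helper written for illustration, not part of this repository, and it returns `None` when TensorFlow cannot be imported.

```python
def visible_gpu_names():
    """Return names of GPU devices visible to TensorFlow, or None if TensorFlow is missing."""
    try:
        # device_lib enumerates the local devices TensorFlow was able to initialize.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


gpus = visible_gpu_names()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif gpus:
    print("TensorFlow sees these GPUs:", gpus)
else:
    print("No GPU is visible to TensorFlow; training would run on the CPU.")
```

An empty list only shows that TensorFlow found no usable GPU. It does not say whether the cause is a CPU-only build, a driver problem, or something else.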

@shizhediao

shizhediao commented Apr 11, 2018

2018-04-11 21:24:01.266522: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-04-11 21:24:01.745824: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_INVALID_DEVICE
2018-04-11 21:24:01.745901: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: LCWM-GPU-20
2018-04-11 21:24:01.745914: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: LCWM-GPU-20
2018-04-11 21:24:01.745961: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 367.48.0
2018-04-11 21:24:01.746002: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.48 Sat Sep 3 18:21:08 PDT 2016
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
"""
2018-04-11 21:24:01.746033: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 367.48.0
2018-04-11 21:24:01.746046: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 367.48.0
epoch 1 loss=6.931528 lr=0.001000: 0%| | 17/104630 [03:37<328:16:09, 11.30s/it]

With output like this, is the GPU actually being used for training?
I see an error in there. Did something go wrong?
It still feels extremely slow.
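
The progress bar above already quantifies how slow this is. Here is a rough sketch of the arithmetic, using only the numbers from that line (17 of 104630 iterations done, about 11.30 s per iteration) and assuming the speed stays constant. `remaining_hours` is a hypothetical helper written for illustration.

```python
def remaining_hours(done, total, seconds_per_iter):
    """Estimate the hours left in an epoch at a constant per-iteration speed."""
    return (total - done) * seconds_per_iter / 3600.0


# Numbers taken from the progress bar in the log above.
hours = remaining_hours(done=17, total=104630, seconds_per_iter=11.30)
print("about %.0f hours (%.1f days) left in epoch 1" % (hours, hours / 24))
```

That comes to roughly 328 hours, in line with the `328:16:09` estimate in the progress bar. A single epoch would take almost two weeks at this rate.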

@qhduan
Owner

qhduan commented Apr 11, 2018

2018-04-11 21:24:01.745824: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_INVALID_DEVICE

When you see a message like this, the GPU is generally not working.
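
A small sketch of how to spot this condition automatically in captured training output. It assumes the log has been saved as text, and it matches only the substring from the error line quoted above. `cuda_init_failed` is a hypothetical helper name.

```python
def cuda_init_failed(log_lines):
    """Return True if any log line reports a failed cuInit call."""
    return any("failed call to cuInit" in line for line in log_lines)


# The error line quoted in this thread, used here as sample input.
sample = [
    "2018-04-11 21:24:01.745824: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] "
    "failed call to cuInit: CUDA_ERROR_INVALID_DEVICE",
]
if cuda_init_failed(sample):
    print("CUDA failed to initialize; check the GPU setup before training.")
```

The check only detects the failure. It does not diagnose why `cuInit` returned an error.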

@shizhediao

Thank you!!!
If I train on a GTX 1080, roughly how long would it take?

@qhduan
Owner

qhduan commented Apr 11, 2018

It varies a lot depending on the parameters.

@yaleimeng

The default dgk_shooter_min.conv used in the demo is too large; you can swap in a smaller one. The smaller the file, the faster the training.
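
One way to get a smaller corpus is to keep only the first few dialogues of the existing file. This is a minimal sketch, and it rests on an assumption not stated in this thread: that each dialogue in the `.conv` file starts with a separator line containing only `E`. `take_first_dialogues` is a hypothetical helper, not part of this repository.

```python
def take_first_dialogues(lines, limit):
    """Yield the lines of the first `limit` dialogues.

    Assumption: each dialogue starts with a line that is exactly "E".
    """
    started = 0
    for line in lines:
        if line.strip() == "E":
            if started == limit:
                return  # the next dialogue would exceed the limit
            started += 1
        yield line


# In-memory sample in the assumed format: three dialogues, keep the first two.
sample = ["E\n", "M hello\n", "M hi\n", "E\n", "M bye\n", "E\n", "M extra\n"]
print("".join(take_first_dialogues(sample, 2)), end="")
```

For a real corpus, pass an open file object as `lines` and write the result to a new file. How to point training at that smaller file depends on the repository's scripts, which this thread does not show.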

@charlesXu86

A question for everyone: did none of you run into multithreading errors during training?

@NexusLee

@charlesXu86 Yes, I did.
