Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dockers] install nvidia-dali-cudaXXX #4532

Merged
merged 14 commits into from
Nov 9, 2020
Merged

[dockers] install nvidia-dali-cudaXXX #4532

merged 14 commits into from
Nov 9, 2020

Conversation

ydcjeff
Copy link
Contributor

@ydcjeff ydcjeff commented Nov 5, 2020

What does this PR do?

Fixes dali installation for #3721

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@ydcjeff ydcjeff added feature Is an improvement or enhancement ci Continuous Integration labels Nov 5, 2020
@ydcjeff ydcjeff added this to the 1.0.x milestone Nov 5, 2020
@ydcjeff ydcjeff requested a review from Borda November 5, 2020 15:35
@codecov
Copy link

codecov bot commented Nov 5, 2020

Codecov Report

Merging #4532 (5227529) into master (4a6721a) will not change coverage.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master   #4532   +/-   ##
======================================
  Coverage      93%     93%           
======================================
  Files         116     116           
  Lines        8833    8833           
======================================
  Hits         8232    8232           
  Misses        601     601           

Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we install it from source rather?

dockers/base-conda/Dockerfile Outdated Show resolved Hide resolved
dockers/base-conda/Dockerfile Outdated Show resolved Hide resolved
dockers/base-cuda/Dockerfile Outdated Show resolved Hide resolved
dockers/base-cuda/Dockerfile Outdated Show resolved Hide resolved
@Borda Borda changed the title [dockers] install nvidia-dali-cuda100 [dockers] install nvidia-dali-cudaXXX Nov 5, 2020
@Borda
Copy link
Member

Borda commented Nov 5, 2020

ok, I have tried to install specific versions according to CUDA version, but it seems to fail as they have 100 or 110 only...
then I am nor sure about compatibility as we use CUDA 10.1 or 10.2...
lets check installation from source

@SeanNaren
Copy link
Contributor

ok, I have tried to install specific versions according to CUDA version, but it seems to fail as they have 100 or 110 only...
then I am nor sure about compatibility as we use CUDA 10.1 or 10.2...
lets check installation from source

@Borda I've tested their cuda 100 version with cuda 10.2, does it make a difference? I think dali-cuda100 should work for both

@Borda
Copy link
Member

Borda commented Nov 5, 2020

@Borda I've tested their cuda 100 version with cuda 10.2, does it make a difference? I think dali-cuda100 should work for both

I would use the compiled one in Drone config and here rather build for a specific version as we also do for APEx now
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/compilation.html

@SeanNaren SeanNaren self-requested a review November 5, 2020 19:38
@SeanNaren
Copy link
Contributor

Ok cool, let's see how this works!

@Borda Borda changed the title [dockers] install nvidia-dali-cudaXXX [dockers] install nvidia-dali-cudaXXX [wip] Nov 5, 2020
@Borda
Copy link
Member

Borda commented Nov 6, 2020

@ydcjeff I got it up to some strange errors with parsing, mind have a look?

CMake Error at cmake/Utils.cmake:345 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/operators/python_function/CMakeLists.txt:19 (build_per_python_lib)

CMake Error at cmake/Utils.cmake:351 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/operators/python_function/CMakeLists.txt:19 (build_per_python_lib)

CMake Error at cmake/Utils.cmake:352 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/operators/python_function/CMakeLists.txt:19 (build_per_python_lib)

-- Adding dependencies to target `dali`: '/usr/local/cuda/targets/x86_64-linux/lib/libnvjpeg_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnppicc_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnppc_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libculibos.a;opencv_core;opencv_imgproc;opencv_imgcodecs;/usr/lib/x86_64-linux-gnu/libjpeg.so;/usr/lib/x86_64-linux-gnu/libtiff.so;/usr/lib/x86_64-linux-gnu/libsndfile.so;avformat;avformat;avcodec;avfilter;avutil;ffts;CUTLASS;cocoapi;/usr/lib/x86_64-linux-gnu/libprotobuf.a;/usr/local/cuda/targets/x86_64-linux/lib/libcudart_static.a;rt;pthread;m;dl'
CMake Error at cmake/Utils.cmake:345 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/python/CMakeLists.txt:19 (build_per_python_lib)

CMake Error at cmake/Utils.cmake:351 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/python/CMakeLists.txt:19 (build_per_python_lib)

CMake Error at cmake/Utils.cmake:352 (string):
  string sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
  dali/python/CMakeLists.txt:19 (build_per_python_lib)

@ydcjeff
Copy link
Contributor Author

ydcjeff commented Nov 6, 2020

@Borda @SeanNaren does dali only run with gpu, right?
I would like to install binary if there is no major difference.
also we are testing the examples only on gpu
and i thought most end users won't compile from source to use dali except from big cooperates.

Not sure what you think?

@Borda
Copy link
Member

Borda commented Nov 6, 2020

I spend some time yesterday a few hours trying to compile it...
I would give it a little more to try - fix the cmake issue above or open issue on DALI project
the ultimate solution would be using the binary and hope that it be compatible with the particular CUDAs

@SeanNaren
Copy link
Contributor

I agree with binaries, I even think that this is something that should be rigorously tested as it is a third party library required for users. I wouldn't spend too much time thinking about building from source and just either moving with the binary, or just skipping the test.

@Borda
Copy link
Member

Borda commented Nov 6, 2020

@SeanNaren @ydcjeff reverted to binaries :]

@ydcjeff ydcjeff changed the title [dockers] install nvidia-dali-cudaXXX [wip] [dockers] install nvidia-dali-cudaXXX Nov 7, 2020
@Borda Borda added the ready PRs ready to be merged label Nov 9, 2020
@ydcjeff ydcjeff merged commit 23719e3 into master Nov 9, 2020
@ydcjeff ydcjeff deleted the dockers/dali branch November 9, 2020 14:48
@edenlightning edenlightning modified the milestones: 1.0.x, 1.1 Nov 10, 2020
rohitgr7 pushed a commit that referenced this pull request Nov 21, 2020
* [dockers] install nvidia-dali-cuda100

* Apply suggestions from code review

* build DALI

* build DALI

* build DALI

* dali from source

* dali from source

* use binaries

* qq

Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Sean Naren <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous Integration feature Is an improvement or enhancement ready PRs ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants