Prefer nvidia channel for conda builds #5648

malfet · 2022-03-20T01:19:50Z

Fixes #5635

facebook-github-bot · 2022-03-20T01:19:58Z

💊 CI failures summary and remediations

As of commit 876bb30 (more details on the Dr. CI page):

3/3 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

binary_linux_conda_py3.8_cu115 (1/3)

Step: "packaging/build_conda.sh"

$SRC_DIR/torchvision/csrc/io/decoder/decoder.cp...tFormat*’ to ‘AVInputFormat*’ [-fpermissive]

          ^~~~~~~~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:33:10: note: suggested alternative: ‘AV_LOG_ERROR’
     case AV_LOCK_DESTROY:
          ^~~~~~~~~~~~~~~
          AV_LOG_ERROR
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In lambda function:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:206:5: error: ‘av_lockmgr_register’ was not declared in this scope
     av_lockmgr_register(&ffmpeg_lock);
     ^~~~~~~~~~~~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In member function ‘virtual bool ffmpeg::Decoder::init(const ffmpeg::DecoderParameters&, ffmpeg::DecoderInCallback&&, std::vector<ffmpeg::DecoderMetadata>*)’:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:280:33: error: invalid conversion from ‘const AVInputFormat*’ to ‘AVInputFormat*’ [-fpermissive]
       fmt = av_find_input_format(fmtName);
             ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In member function ‘int ffmpeg::Decoder::getFrame(size_t)’:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:509:27: warning: ‘void av_init_packet(AVPacket*)’ is deprecated [-Wdeprecated-declarations]
   av_init_packet(&avPacket);
                           ^
In file included from $BUILD_PREFIX/include/libavcodec/avcodec.h:45:0,
                 from $SRC_DIR/torchvision/csrc/io/decoder/defs.h:12,
                 from $SRC_DIR/torchvision/csrc/io/decoder/seekable_buffer.h:3,
                 from $SRC_DIR/torchvision/csrc/io/decoder/decoder.h:5,

binary_linux_conda_py3.10_cu115 (2/3)

Step: "packaging/build_conda.sh"

$SRC_DIR/torchvision/csrc/io/decoder/decoder.cp...tFormat*’ to ‘AVInputFormat*’ [-fpermissive]

          ^~~~~~~~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:33:10: note: suggested alternative: ‘AV_LOG_ERROR’
     case AV_LOCK_DESTROY:
          ^~~~~~~~~~~~~~~
          AV_LOG_ERROR
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In lambda function:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:206:5: error: ‘av_lockmgr_register’ was not declared in this scope
     av_lockmgr_register(&ffmpeg_lock);
     ^~~~~~~~~~~~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In member function ‘virtual bool ffmpeg::Decoder::init(const ffmpeg::DecoderParameters&, ffmpeg::DecoderInCallback&&, std::vector<ffmpeg::DecoderMetadata>*)’:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:280:33: error: invalid conversion from ‘const AVInputFormat*’ to ‘AVInputFormat*’ [-fpermissive]
       fmt = av_find_input_format(fmtName);
             ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp: In member function ‘int ffmpeg::Decoder::getFrame(size_t)’:
$SRC_DIR/torchvision/csrc/io/decoder/decoder.cpp:509:27: warning: ‘void av_init_packet(AVPacket*)’ is deprecated [-Wdeprecated-declarations]
   av_init_packet(&avPacket);
                           ^
In file included from $BUILD_PREFIX/include/libavcodec/avcodec.h:45:0,
                 from $SRC_DIR/torchvision/csrc/io/decoder/defs.h:12,
                 from $SRC_DIR/torchvision/csrc/io/decoder/seekable_buffer.h:3,
                 from $SRC_DIR/torchvision/csrc/io/decoder/decoder.h:5,

binary_linux_conda_py3.7_cu115 (3/3)

Step: "packaging/build_conda.sh"

error_prefix='Error compiling objects for extension')

    self._build_extensions_serial()
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 593, in unix_wrap_ninja_compile
    with_cuda=with_cuda)
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1473, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "/opt/conda/conda-bld/torchvision_1647881675966/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1805, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
Traceback (most recent call last):
  File "/opt/conda/bin/conda-build", line 11, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.9/site-packages/conda_build/cli/main_build.py", line 488, in main
    execute(sys.argv[1:])
  File "/opt/conda/lib/python3.9/site-packages/conda_build/cli/main_build.py", line 477, in execute
    outputs = api.build(args.recipe, post=args.post, test_run_post=args.test_run_post,

This comment was automatically generated by Dr. CI (expand for details).

datumbox

Thanks @malfet.

Is this a temporary update to fix the issue or does it mean we will change our recommended channel on the download page at pytorch.org?

datumbox · 2022-03-21T09:15:09Z

@malfet The issue persists on the cmake_linux_gpu job:
ImportError: libcupti.so.11.3: cannot open shared object file: No such file or directory
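
A quick way to reproduce this kind of failure outside the test suite is to ask the dynamic loader for the library directly. The following is a minimal sketch and not part of this PR: the helper name `can_load_shared_library` is hypothetical, and the soname is simply the one from the error above.

```python
import ctypes


def can_load_shared_library(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname`, False otherwise."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library (or one of its own dependencies) cannot be found.
        return False
    return True


if __name__ == "__main__":
    # The soname below is copied from the ImportError in this thread.
    for soname in ("libcupti.so.11.3",):
        status = "found" if can_load_shared_library(soname) else "MISSING"
        print(f"{soname}: {status}")
```

Running this inside the environment used by the job should show whether the installed CUDA packages actually ship the library the import needs.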

The binary_linux_conda_*_cu115 jobs are failing because they use ffmpeg-5.0.0, which removes or changes some APIs that we currently use (see the `av_lockmgr_register` and `av_find_input_format` errors in the log above). Another PR, #5644, attempts to fix this.
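
To tell in advance whether a build environment will hit these errors, one option is to inspect the ffmpeg version it resolved. The sketch below is an illustration only: the function names are hypothetical, the banner format assumed is the usual first line of `ffmpeg -version`, and the "major version 5 or later" threshold is an assumption based on the failures reported in this thread.

```python
import re
from typing import Optional

# Matches banners such as "ffmpeg version 5.0.0 ..." or "ffmpeg version n5.0 ...".
_VERSION_RE = re.compile(r"ffmpeg version n?(\d+)")


def ffmpeg_major_version(version_banner: str) -> Optional[int]:
    """Extract the major version from an `ffmpeg -version` banner, if present."""
    match = _VERSION_RE.search(version_banner)
    return int(match.group(1)) if match else None


def lacks_lock_manager_api(version_banner: str) -> bool:
    """Assumption: `av_lockmgr_register` is unavailable from ffmpeg 5 onwards."""
    major = ffmpeg_major_version(version_banner)
    return major is not None and major >= 5


if __name__ == "__main__":
    # Hypothetical sample banner; a real one would come from `ffmpeg -version`.
    banner = "ffmpeg version 5.0.0 Copyright (c) 2000-2022 the FFmpeg developers"
    print(ffmpeg_major_version(banner), lacks_lock_manager_api(banner))
```

A banner that does not match (for example a git snapshot build) yields `None`, and the check then reports `False` rather than guessing.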

I think we should merge this after the cmake job is also fixed and then resolve the rest of the jobs on the other PR.

1 job still failing

datumbox

LGTM, thanks for patching cmake.

Let's merge on "green-ish" CI.

github-actions · 2022-03-21T17:59:17Z

Hey @malfet!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

To mitigate missing `libcupti.so` dependency
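
The diff itself is not shown in this thread, so the following is only a sketch of what "preferring" a channel amounts to. conda searches channels passed with `-c` in the order they are given, so putting `nvidia` first makes it win over later channels. The helper names, the other channel names and the package spec are assumptions chosen for illustration.

```python
from typing import List, Sequence


def prefer_channel(preferred: str, channels: Sequence[str]) -> List[str]:
    """Return `channels` with `preferred` moved (or added) to the front."""
    return [preferred] + [c for c in channels if c != preferred]


def conda_channel_args(channels: Sequence[str]) -> List[str]:
    """Build `-c <channel>` arguments, keeping order and dropping repeats."""
    seen = set()
    args: List[str] = []
    for channel in channels:
        if channel in seen:
            continue
        seen.add(channel)
        args += ["-c", channel]
    return args


if __name__ == "__main__":
    # Channel list and package spec are hypothetical examples.
    ordered = prefer_channel("nvidia", ["defaults", "pytorch-nightly", "nvidia"])
    command = ["conda", "install", "-y"] + conda_channel_args(ordered) + ["cudatoolkit=11.3"]
    print(" ".join(command))
```

Keeping the ordering logic in one place makes it easier to apply the same preference to the build, unittest and cmake jobs.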

* added usps dataset * fixed type issues * fix mobilnet norm layer test (#5643) * xfail mobilnet norm layer test * fix test * More robust check in tests for 16 bits images (#5652) * Prefer nvidia channel for conda builds (#5648) To mitigate missing `libcupti.so` dependency * fix torchdata CI installation (#5657) * update urls for kinetics dataset (#5578) * update urls for kinetics dataset * update urls for kinetics dataset * remove errors * update the changes and add test option to split * added test to valid values for split arg * change .txt to .csv for annotation url of k600 Co-authored-by: Nicolas Hug <[email protected]> * Port Multi-weight support from prototype to main (#5618) * Moving basefiles outside of prototype and porting Alexnet, ConvNext, Densenet and EfficientNet. * Porting googlenet * Porting inception * Porting mnasnet * Porting mobilenetv2 * Porting mobilenetv3 * Porting regnet * Porting resnet * Porting shufflenetv2 * Porting squeezenet * Porting vgg * Porting vit * Fix docstrings * Fixing imports * Adding missing import * Fix mobilenet imports * Fix tests * Fix prototype tests * Exclude get_weight from models on test * Fix init files * Porting googlenet * Porting inception * porting mobilenetv2 * porting mobilenetv3 * porting resnet * porting shufflenetv2 * Fix test and linter * Fixing docs. 
* Porting Detection models (#5617) * fix inits * fix docs * Port faster_rcnn * Port fcos * Port keypoint_rcnn * Port mask_rcnn * Port retinanet * Port ssd * Port ssdlite * Fix linter * Fixing tests * Fixing tests * Fixing vgg test * Porting Optical Flow, Segmentation, Video models (#5619) * Porting raft * Porting video resnet * Porting deeplabv3 * Porting fcn and lraspp * Fixing the tests and linter * Porting docs, examples, tutorials and galleries (#5620) * Fix examples, tutorials and gallery * Update gallery/plot_optical_flow.py Co-authored-by: Nicolas Hug <[email protected]> * Fix import * Revert hardcoded normalization * fix uncommitted changes * Fix bug * Fix more bugs * Making resize optional for segmentation * Fixing preset * Fix mypy * Fixing documentation strings * Fix flake8 * minor refactoring Co-authored-by: Nicolas Hug <[email protected]> * Resolve conflict * Porting model tests (#5622) * Porting tests * Remove unnecessary variable * Fix linter * Move prototype to extended tests * Fix download models job * Update CI on Multiweight branch to use the new weight download approach (#5628) * port Pad to prototype transforms (#5621) * port Pad to prototype transforms * use literal * Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624) Co-authored-by: Anton Thomma <[email protected]> Co-authored-by: Vasilis Vryniotis <[email protected]> * pre-download model weights in CI docs build (#5625) * pre-download model weights in CI docs build * move changes into template * change docs image * Regenerated config.yml Co-authored-by: Philip Meier <[email protected]> Co-authored-by: Anton Thomma <[email protected]> Co-authored-by: Anton Thomma <[email protected]> * Porting reference scripts and updating presets (#5629) * Making _preset.py classes * Remove support of targets on presets. 
* Rewriting the video preset * Adding tests to check that the bundled transforms are JIT scriptable * Rename all presets from *Eval to *Inference * Minor refactoring * Remove --prototype and --pretrained from reference scripts * remove pretained_backbone refs * Corrections and simplifications * Fixing bug * Fixing linter * Fix flake8 * restore documentation example * minor fixes * fix optical flow missing param * Fixing commands * Adding weights_backbone support in detection and segmentation * Updating the commands for InceptionV3 * Setting `weights_backbone` to its fully BC value (#5653) * Replace default `weights_backbone=None` with its BC values. * Fixing tests * Fix linter * Update docs. * Update preprocessing on reference scripts. * Change qat/ptq to their full values. * Refactoring preprocessing * Fix video preset * No initialization on VGG if pretrained * Fix warning messages for backbone utils. * Adding star to all preset constructors. * Fix mypy. Co-authored-by: Nicolas Hug <[email protected]> Co-authored-by: Philip Meier <[email protected]> Co-authored-by: Anton Thomma <[email protected]> Co-authored-by: Anton Thomma <[email protected]> * Apply suggestions from code review Co-authored-by: Philip Meier <[email protected]> * use decompressor for extracting bz2 * Apply suggestions from code review Co-authored-by: Philip Meier <[email protected]> * Apply suggestions from code review Co-authored-by: Philip Meier <[email protected]> * fixed lint fails * added tests for USPS * check image shape * fix tests * check shape on image directly * Apply suggestions from code review Co-authored-by: Philip Meier <[email protected]> * removed test and comments * Update test/test_prototype_builtin_datasets.py Co-authored-by: Nicolas Hug <[email protected]> Co-authored-by: Philip Meier <[email protected]> Co-authored-by: Nicolas Hug <[email protected]> Co-authored-by: Nikita Shulga <[email protected]> Co-authored-by: Sahil Goyal <[email protected]> Co-authored-by: Vasilis 
Vryniotis <[email protected]> Co-authored-by: Anton Thomma <[email protected]> Co-authored-by: Anton Thomma <[email protected]>

Summary: To mitigate missing `libcupti.so` dependency

(Note: this ignores all push blocking failures!)

Reviewed By: datumbox
Differential Revision: D35216765
fbshipit-source-id: 99e9ac632b08961011b56a6e9b9a9ecce670fe48

Summary: * added usps dataset * fixed type issues * fix mobilnet norm layer test (#5643) * xfail mobilnet norm layer test * fix test * More robust check in tests for 16 bits images (#5652) * Prefer nvidia channel for conda builds (#5648) To mitigate missing `libcupti.so` dependency * fix torchdata CI installation (#5657) * update urls for kinetics dataset (#5578) * update urls for kinetics dataset * update urls for kinetics dataset * remove errors * update the changes and add test option to split * added test to valid values for split arg * change .txt to .csv for annotation url of k600 * Port Multi-weight support from prototype to main (#5618) * Moving basefiles outside of prototype and porting Alexnet, ConvNext, Densenet and EfficientNet. * Porting googlenet * Porting inception * Porting mnasnet * Porting mobilenetv2 * Porting mobilenetv3 * Porting regnet * Porting resnet * Porting shufflenetv2 * Porting squeezenet * Porting vgg * Porting vit * Fix docstrings * Fixing imports * Adding missing import * Fix mobilenet imports * Fix tests * Fix prototype tests * Exclude get_weight from models on test * Fix init files * Porting googlenet * Porting inception * porting mobilenetv2 * porting mobilenetv3 * porting resnet * porting shufflenetv2 * Fix test and linter * Fixing docs. 
* Porting Detection models (#5617) * fix inits * fix docs * Port faster_rcnn * Port fcos * Port keypoint_rcnn * Port mask_rcnn * Port retinanet * Port ssd * Port ssdlite * Fix linter * Fixing tests * Fixing tests * Fixing vgg test * Porting Optical Flow, Segmentation, Video models (#5619) * Porting raft * Porting video resnet * Porting deeplabv3 * Porting fcn and lraspp * Fixing the tests and linter * Porting docs, examples, tutorials and galleries (#5620) * Fix examples, tutorials and gallery * Update gallery/plot_optical_flow.py * Fix import * Revert hardcoded normalization * fix uncommitted changes * Fix bug * Fix more bugs * Making resize optional for segmentation * Fixing preset * Fix mypy * Fixing documentation strings * Fix flake8 * minor refactoring * Resolve conflict * Porting model tests (#5622) * Porting tests * Remove unnecessary variable * Fix linter * Move prototype to extended tests * Fix download models job * Update CI on Multiweight branch to use the new weight download approach (#5628) * port Pad to prototype transforms (#5621) * port Pad to prototype transforms * use literal * Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624) * pre-download model weights in CI docs build (#5625) * pre-download model weights in CI docs build * move changes into template * change docs image * Regenerated config.yml * Porting reference scripts and updating presets (#5629) * Making _preset.py classes * Remove support of targets on presets. 
* Rewriting the video preset * Adding tests to check that the bundled transforms are JIT scriptable * Rename all presets from *Eval to *Inference * Minor refactoring * Remove --prototype and --pretrained from reference scripts * remove pretained_backbone refs * Corrections and simplifications * Fixing bug * Fixing linter * Fix flake8 * restore documentation example * minor fixes * fix optical flow missing param * Fixing commands * Adding weights_backbone support in detection and segmentation * Updating the commands for InceptionV3 * Setting `weights_backbone` to its fully BC value (#5653) * Replace default `weights_backbone=None` with its BC values. * Fixing tests * Fix linter * Update docs. * Update preprocessing on reference scripts. * Change qat/ptq to their full values. * Refactoring preprocessing * Fix video preset * No initialization on VGG if pretrained * Fix warning messages for backbone utils. * Adding star to all preset constructors. * Fix mypy. * Apply suggestions from code review * use decompressor for extracting bz2 * Apply suggestions from code review * Apply suggestions from code review * fixed lint fails * added tests for USPS * check image shape * fix tests * check shape on image directly * Apply suggestions from code review * removed test and comments * Update test/test_prototype_builtin_datasets.py (Note: this ignores all push blocking failures!) 
Reviewed By: datumbox
Differential Revision: D35216783
fbshipit-source-id: 556a63a89f15d1541ac2b479244a7b6c564eff14
Co-authored-by: Nicolas Hug
Co-authored-by: Anton Thomma
Co-authored-by: Vasilis Vryniotis
Co-authored-by: Philip Meier
Co-authored-by: Nikita Shulga
Co-authored-by: Sahil Goyal

pytorch-bot bot added the ciflow/default label Mar 20, 2022

facebook-github-bot added the cla signed label Mar 20, 2022

malfet requested review from atalman and datumbox March 20, 2022 04:17

datumbox previously approved these changes Mar 20, 2022

malfet added 3 commits March 21, 2022 09:49

Prefer nvidia channel for conda builds

43e2f8f

And same for unittests

7059798

And here

876bb30

malfet force-pushed the malfet/prefer-nvidia-channel-for-conda-builds branch from 33a72da to 876bb30 Compare March 21, 2022 16:52

datumbox approved these changes Mar 21, 2022

malfet merged commit fbc8ea4 into main Mar 21, 2022

malfet deleted the malfet/prefer-nvidia-channel-for-conda-builds branch March 21, 2022 17:58

datumbox added the topic: build, module: ci, and other labels Mar 21, 2022

lezwon pushed a commit to lezwon/vision that referenced this pull request Mar 23, 2022

Prefer nvidia channel for conda builds (pytorch#5648)

7fd2ea0

To mitigate missing `libcupti.so` dependency
