YOLOv5 Apple Metal Performance Shader (MPS) support #7878
Following https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/
Should work on Apple M1 devices with PyTorch nightly installed, using the argument `--device mps`. Usage examples:
```bash
python train.py --device mps
python detect.py --device mps
python val.py --device mps
```
Raised issue in pytorch/pytorch#77748
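For anyone wiring `--device mps` into their own scripts, here is a minimal sketch of how a device string might be resolved. It assumes a PyTorch build where `torch.backends.mps.is_available()` exists (1.12 or later); the helper name `resolve_device` is hypothetical and is not YOLOv5's `select_device`.

```python
import torch


def resolve_device(requested: str = "") -> torch.device:
    """Hypothetical helper: map a --device string to a torch.device."""
    requested = requested.strip().lower()
    # torch.backends.mps is missing on older PyTorch builds, so probe defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()

    if requested == "mps":
        if not mps_ok:
            raise RuntimeError("MPS requested but not available in this PyTorch build")
        return torch.device("mps")
    if requested == "cpu":
        return torch.device("cpu")
    if requested:
        # Any other explicit string, e.g. "cuda:0", is passed through unchanged.
        return torch.device(requested)
    # Nothing requested: prefer CUDA, then MPS, then CPU (an assumed ordering).
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    return torch.device("mps" if mps_ok else "cpu")


print(resolve_device("cpu"))
```

Raising on an unavailable `mps` request, rather than silently falling back to CPU, is a design choice of this sketch; it makes a missing nightly install obvious instead of hiding it behind slow CPU runs.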
Python-3.9.13 torch-1.11.0 (Macbook Air M1) - CPU
(venv) (base) glennjocher@Glenns-MacBook-Air yolov5 % python detect.py
detect: weights=yolov5s.pt, source=data/images, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.1-212-g7c13c46 Python-3.9.13 torch-1.11.0 CPU
Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients
image 1/2 /Users/glennjocher/PycharmProjects/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, Done. (0.084s)
image 2/2 /Users/glennjocher/PycharmProjects/yolov5/data/images/zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.068s)
Speed: 0.4ms pre-process, 76.1ms inference, 0.5ms NMS per image at shape (1, 3, 640, 640)
Python-3.9.13 torch-1.11.0 (Macbook Air M1) - MPS
* Apple Metal Performance Shader (MPS) device support Following https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ Should work with Apple M1 devices with PyTorch nightly installed with command `--device mps`. Usage examples:
```bash
python train.py --device mps
python detect.py --device mps
python val.py --device mps
```
* Update device strategy to fix MPS issue
@RacerChen if you've installed pytorch nightly and you have a supported device then the correct usage example would be:
Thanks for answering. My machine is a MacBook Air M1 (2020). I have already installed PyTorch nightly. I trained the model with
By the way, if I remove the
@RacerChen I had the same issue here. Try uninstalling torch, torchvision and torchaudio by running
@Djordi97 Thanks a lot, I got it. An easier way is to clone a fresh copy of the yolov5 project and configure it again. :) Now it works, but not completely. I am now facing the same problem of Error:
@RacerChen the
By using the mentioned command to start training
Note that I installed PyTorch Nightly from their official website using the mentioned commands.
@mohammed-ab99 the PyTorch team is aware of ongoing MPS issues tracked in pytorch/pytorch#77886, but I can't tell from your message whether this falls under that. Are you saying
@glenn-jocher I am trying to train on my custom data and it is failing with this error. It is generated after running
This is the complete traceback:
There is also a warning generated before the error is thrown:
@mohammed-ab99 the first might be resolved by reducing any FP64 variables to FP32. Do you know which variable is FP64? The second issue is already open in #8508
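To illustrate the FP64 to FP32 suggestion, here is a minimal sketch that downcasts a tensor only when the target is MPS. The helper name `downcast_for_device` and the example tensor `gain` are hypothetical; this is not the change made in the repository, only one way the idea could look.

```python
import torch


def downcast_for_device(x: torch.Tensor, device_type: str) -> torch.Tensor:
    """Hypothetical helper: cast FP64 tensors to FP32 when targeting MPS."""
    if device_type == "mps" and x.dtype == torch.float64:
        return x.to(torch.float32)
    return x  # other dtypes and other devices are left untouched


# Example FP64 tensor, created explicitly here for illustration.
gain = torch.ones(7, dtype=torch.float64)

print(downcast_for_device(gain, "mps").dtype)  # torch.float32
print(downcast_for_device(gain, "cpu").dtype)  # torch.float64
```

Calling the helper before `.to("mps")` keeps the cast on the CPU side, so the FP64 tensor never reaches the MPS backend.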
@glenn-jocher Actually I am using the code as is without any modifications, but according to the traceback it is in this line: This is the
But I am not sure if this is the variable or it is another one.
Avoid FP64 ops for MPS support Resolves #7878 (comment)
@mohammed-ab99 good news 😃! Your original issue may now be fixed ✅ in PR #8511. To receive this update:
Thank you for spotting this issue and informing us of the problem. This likely won't resolve all issues for you, so if you run into another error on training with MPS please let us know.
@glenn-jocher thanks, the error has disappeared now. However, I think I have now run into a PyTorch MPS support problem, as this error appeared:
I will be following up with the other issue.
@mohammed-ab99 what line is causing that error?
@glenn-jocher This is the traceback:
This is the loop inside
@mohammed-ab99 got it. Seems like aten::nonzero is required for the indexing op on loss.py L208, as well as in NMS. I would stop using PYTORCH_ENABLE_MPS_FALLBACK=1 and start debugging loss.py L208 to see if you can restructure this op in a different way that bypasses the aten::nonzero requirement. I don't have availability right now to do this but I'll add a TODO to track this closer.
Line 208 in 526e650
@mohammed-ab99 for example you could try using https://pytorch.org/docs/stable/generated/torch.index_select.html
@mohammed-ab99 I noticed that j is also a boolean tensor. Perhaps you need to use torch.nonzero to get True indices on the boolean vector and then that might work. https://pytorch.org/docs/stable/generated/torch.nonzero.html
Hopefully this can be fixed later on, along with official MPS support. Either way, thanks for your notes. I'll check and let you know.
@mohammed-ab99 well yes, ideally the torch team should fix this but without a clear schedule we should try to debug alternative implementations on our end, making sure to profile any changes for speed differences.
@mohammed-ab99 I should be able to test on our M1 Macbook this weekend.
That sounds good! Thanks.
I am running into the same issue that @mohammed-ab99 is having #7878 (comment) I attempted to replace
I am able to show that index_select with nonzero is the same as boolean-mask indexing for the 2D case, but for the 3D case I am having a hard time working out how to reshape:
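On the reshape question, one possible approach is to flatten the dimensions covered by the mask, so that the 3D case reduces to the 2D one. The sketch below is hedged: the helper name `masked_select_rows` and the example shapes are hypothetical, and it still calls `nonzero`, so on its own it does not remove the aten::nonzero dependency discussed above. It only shows that the reshaped `index_select` matches `t[j]`.

```python
import torch


def masked_select_rows(t: torch.Tensor, j: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: equivalent of t[j] for a boolean mask j
    whose shape matches the leading dimensions of t."""
    lead = j.dim()
    # Collapse the masked leading dims into one, keep the trailing dims.
    flat_t = t.reshape(-1, *t.shape[lead:])
    # squeeze(1) keeps the index 1-D even if only one element is True.
    idx = j.reshape(-1).nonzero(as_tuple=False).squeeze(1)
    return flat_t.index_select(0, idx)


t = torch.arange(3 * 5 * 7, dtype=torch.float32).reshape(3, 5, 7)
j = torch.tensor([
    [True, False, True, False, False],
    [False, False, False, False, False],
    [True, True, True, True, True],
])

out = masked_select_rows(t, j)
print(out.shape)               # torch.Size([7, 7])
print(torch.equal(out, t[j]))  # True
```

Because `reshape` flattens in row-major order, the selected rows come out in the same order as boolean-mask indexing produces them.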
I'm using a Mac with an M1 Max chip, and I get this error when running train.py. What is the problem?
Hello @xxxkkw, Thank you for reporting this issue. It appears that you're encountering an assertion error related to the Metal Performance Shaders (MPS) backend on your M1 Max chip when running the training script.
Steps to Troubleshoot:
Example Code:
Here's a snippet to help you with the tensor indexing:
```python
import torch

# Example tensor
t = torch.randn(3, 5, 7)
j = torch.tensor([True, False, True])

# Indexing with nonzero; squeeze(1) keeps the index 1-D even if only one element is True
t = torch.index_select(t, dim=0, index=j.nonzero(as_tuple=False).squeeze(1))
print(t.shape)
```
Reporting Bugs:
If the issue is reproducible with the latest versions, please consider opening a bug report on the PyTorch GitHub repository with detailed information about your setup and the error message.
Thank you for your patience and contributions to improving YOLOv5! If you have any further questions, feel free to ask.
Following https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ posted in pytorch/pytorch#47702
Should work with Apple M1 devices with PyTorch nightly installed with command `--device mps`. Usage examples:
EDIT: Requires universal2 installer with Python>=3.9.1 from https://www.python.org/downloads/macos/ using command:
EDIT2: Pending new nightly torchvision arm64 distribution in pytorch/vision#6050
EDIT3: PR is merged, PyTorch seems to have a few TODOs on their side which are out of my control: 1) create torchvision nightly arm64, 2) resolve `buffer is not large enough` error reported by many users pytorch/pytorch#77748 (comment)
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Enhanced device compatibility in model loading and environment configuration for Ultralytics YOLOv5.
📊 Key Changes
`attempt_load` function signature: Replaced `map_location` parameter with `device` for clarity.
… `device` argument directly.
… `select_device` function.
🎯 Purpose & Impact