pdserving deployment of PP-OCR4 runs out of GPU memory #7

Open
shiyutang opened this issue Aug 28, 2023 · 4 comments

@shiyutang
Owner

shiyutang commented Aug 28, 2023

Please provide the following information so the problem can be located quickly

  • System Environment: Docker, T4 GPU with 16 GB of memory
  • Version: paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8; PaddleOCR: develop
  • Command: python3 web_service.py --config=config.yml
  • Complete Error Message: Out of memory error on GPU 0. Cannot allocate 225.000000MB memory on GPU 0, 14.569336GB memory has been allocated and available memory is only 190.750000MB.

Please check whether there is any other process using GPU 0.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.
    If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is export FLAGS_use_cuda_managed_memory=false.
    (at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
    W0828 07:40:58.236891 205 operator.cc:288] concat raises an exception paddle::memory::allocation::BadAlloc, ResourceExhaustedError:

Out of memory error on GPU 0. Cannot allocate 289.078125MB memory on GPU 0, 14.514648GB memory has been allocated and available memory is only 246.750000MB.

Please check whether there is any other process using GPU 0.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.
    If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is export FLAGS_use_cuda_managed_memory=false.
    (at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)

We provide AceIssueSolver to help answer issues. Do you want to use it? (Please write yes/no): yes

Please try not to include images in the issue.

@aceissuesolver

Issue Response

  • Introduction: Thanks for opening this issue, I am the AceIssueSolver that will help you with your questions!
  • 🎯 First round response: The answer to your question is:

Based on the information provided, you are running into insufficient GPU memory. According to the error message, you can try the following:

  1. Check whether any other process is using GPU 0. If so, stop it, or start PaddlePaddle on another GPU.
  2. If no other process is using GPU 0, decrease the model's batch size.
  3. If neither of the above resolves the out-of-memory problem, you can try CUDA managed memory; the command is export FLAGS_use_cuda_managed_memory=false.

Note that the version number, run command, and complete error message you provided are also very important for locating the problem. If none of the above solves your issue, please provide more information so that we can help.
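A minimal sketch of applying option 3, assuming the flag is set before the serving process initializes paddle. Note that PaddlePaddle's documentation enables CUDA managed memory by setting FLAGS_use_cuda_managed_memory to true, so the value shown in the error message is worth double-checking against the docs for your Paddle version:

# Sketch: PaddlePaddle reads its GFlags from the environment, so set them before
# paddle initializes, e.g. at the very top of web_service.py or exported in the
# shell before running "python3 web_service.py".
import os

# Per the PaddlePaddle docs, "true" enables the CUDA managed-memory allocator,
# which can oversubscribe device memory by migrating pages to the host.
os.environ["FLAGS_use_cuda_managed_memory"] = "true"

import paddle  # the flag is read when paddle's native runtime loads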

How to use

To invoke the AceIssueSolver, add a comment using the following command:
@Ace-issue-solver <QUESTION>: Ask AceIssueSolver follow-up questions.

@shiyutang
Owner Author

It may be that the input image is too large, but I have already limited the longest side in the DB detection network:

class DetOp(Op):
    def init_op(self):
        self.det_preprocess = Sequential([
            DetResizeForTest(limit_side_len=960, limit_type="max"), Div(255),
            Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
                (2, 0, 1))
        ])
        self.filter_func = FilterBoxes(10, 10)
        self.post_func = DBPostProcess({
            "thresh": 0.3,
            "box_thresh": 0.6,
            "max_candidates": 1000,
            "unclip_ratio": 1.5,
            "min_size": 3
        })
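A minimal follow-up sketch, assuming the detector input is the dominant allocation (not something this issue confirms): tightening the resize cap further, e.g. to limit_side_len=736, shrinks the detector tensor and its activations and is a cheap way to check whether the OOM comes from the detection stage.

# Hypothetical variant of the same preprocessing with a tighter cap;
# 736 is an arbitrary smaller multiple of 32 chosen only for illustration.
self.det_preprocess = Sequential([
    DetResizeForTest(limit_side_len=736, limit_type="max"), Div(255),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
        (2, 0, 1))
])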

@shiyutang
Owner Author

shiyutang commented Aug 29, 2023

@Ace-issue-solver Is there a way to limit the maximum GPU memory usage? It keeps running out of memory (OOM).
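For reference, a minimal sketch of capping this process's GPU memory, assuming the documented PaddlePaddle flag FLAGS_gpu_memory_limit_mb (an upper bound on what one process may allocate; 0 means no limit). A hard cap does not make the models need less memory, so allocations beyond it will still fail, but it keeps the serving process from claiming the whole card:

# Sketch: set a per-process GPU memory ceiling via a PaddlePaddle GFlag.
# It must be in the environment before paddle initializes, e.g. exported in the
# shell before "python3 web_service.py" or set at the top of web_service.py.
import os

os.environ["FLAGS_gpu_memory_limit_mb"] = "12288"  # illustrative ~12 GB cap on the 16 GB T4

import paddle  # the flag is read when paddle's native runtime loads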

Repository owner deleted a comment from aceissuesolver bot Sep 7, 2023
Repository owner deleted a comment from aceissuesolver bot Sep 7, 2023
Repository owner deleted a comment from aceissuesolver bot Sep 7, 2023
Repository owner deleted a comment from aceissuesolver bot Sep 7, 2023
@1037419569

I remember that previously, when I changed the ops from processes to threads, GPU memory usage dropped, but I don't know what the drawbacks of switching to threads are.
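For context, a hedged pointer rather than a verified fix: in the Paddle Serving pipeline config this process-vs-thread choice is the is_thread_op switch (assuming the standard config.yml layout used by deploy/pdserving). Thread ops run inside one process and share a single CUDA context, so each op no longer holds its own GPU memory pool, at the cost of Python GIL contention between ops.

# config.yml (assumed layout): run the ops as threads in a single process so they
# share one CUDA context instead of each worker process holding its own pool.
is_thread_op: True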
