GPU memory spikes during evaluate #15

Open
toddlt opened this issue Nov 24, 2023 · 2 comments


toddlt commented Nov 24, 2023

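I added an eval_dataset to the trainer and wrote a compute_metric function to calculate some metrics during eval, such as precision/recall for function calling and the BLEU score of the reply text.

The problem is that GPU memory spikes during evaluate: training uses 10+ GB of GPU memory, but during eval usage suddenly jumps to 60+ GB and keeps growing until it hits OOM.

Have you run into anything similar?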
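One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if the trainer is Hugging Face's `Trainer`, its evaluation loop keeps the outputs of every batch on the GPU until the end unless `eval_accumulation_steps` is set, and for a language model those outputs are full-vocabulary logits. The rough arithmetic below shows how quickly that adds up. The sample count, sequence length, and vocabulary size are hypothetical numbers chosen only for illustration.

```python
GIB = 1024 ** 3


def logits_bytes(num_samples: int, seq_len: int, vocab_size: int,
                 bytes_per_value: int = 4) -> int:
    """Bytes needed to hold full float32 logits for every eval sample at once."""
    return num_samples * seq_len * vocab_size * bytes_per_value


def token_ids_bytes(num_samples: int, seq_len: int,
                    bytes_per_value: int = 8) -> int:
    """Bytes needed if only the argmax token id (int64) is kept per position."""
    return num_samples * seq_len * bytes_per_value


# Hypothetical eval set: 500 samples, 512 tokens each, 64000-entry vocabulary.
full_bytes = logits_bytes(500, 512, 64000)
ids_bytes = token_ids_bytes(500, 512)

print(f"full logits: {full_bytes / GIB:.2f} GiB")
print(f"token ids:   {ids_bytes / 1024 ** 2:.2f} MiB")
```

With these made-up sizes the accumulated logits come to about 61 GiB, while keeping only token ids needs about 2 MiB. If this is indeed the cause, two `Trainer` options may be worth trying: `eval_accumulation_steps` in `TrainingArguments`, which periodically moves accumulated outputs to the CPU, and the `preprocess_logits_for_metrics` argument, which can reduce logits to token ids before they are accumulated.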

Owner

xxw1995 commented Nov 25, 2023

Most likely LoRA was not loaded during eval.

Author

toddlt commented Nov 25, 2023

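That would still be strange, since loading LoRA shouldn't take much GPU memory. Also, the same problem occurs with pt2: eval runs out of GPU memory there too.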
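The point that LoRA weights are small can be checked with a quick count. For a weight matrix of shape d_out x d_in, a rank-r LoRA adapter adds two matrices of shapes r x d_in and d_out x r. The layer size and rank below are hypothetical, not taken from this repository.

```python
def dense_params(d_out: int, d_in: int) -> int:
    """Parameter count of the original (frozen) weight matrix."""
    return d_out * d_in


def lora_extra_params(d_out: int, d_in: int, rank: int) -> int:
    """Extra parameters added by LoRA: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical 4096 x 4096 projection with LoRA rank 8.
base = dense_params(4096, 4096)
extra = lora_extra_params(4096, 4096, 8)

print(f"base params:  {base}")
print(f"LoRA params:  {extra}")
print(f"LoRA / base:  {extra / base:.4%}")
print(f"LoRA in fp16: {extra * 2 / 1024:.0f} KiB")
```

Under these assumptions the adapter adds 65,536 parameters to a matrix of 16,777,216, which is 1/256 of the original, or 128 KiB in fp16 per adapted matrix. That is far too small to account for a jump of tens of gigabytes.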
