Clarification on MetaRM-optimization Implementation #42

Benjamin-eecs · 2024-01-15T17:17:29Z

Hi there,

I've read your paper and am intrigued by the MetaRM-optimization algorithm. Could you share details on the gradient computation and any specific conditions for the update step (θ_{t+1} ← θ_t - α ∇_{θ_t} L_{θ'}(X_t))? Any reference in open-source code would also be appreciated. Thanks a lot!

Ablustrund · 2024-01-31T02:45:25Z

Hi, thank you very much for your attention!
The release of the MetaRM code has been delayed as we have been occupied with paper submissions. We will be publishing the open-source code soon.
Can @liuyan please provide an explanation of the technical details of MetaRM here beforehand?

grace-skaiii · 2024-01-31T06:26:41Z

@Benjamin-eecs Sorry for the late reply!

During each iteration, we back up the model parameters and then perform the Meta-process and MetaRM-optimization in sequence.
For the MetaRM-optimization step you asked about, we first run backpropagation to obtain the gradients. We then copy the backed-up model parameters back into the model and perform gradient descent with those gradients.
Because gradient descent updates the parameters held inside the optimizer, we back up and reset the parameters through DeepSpeed's interfaces, safe_get_full_fp32_param and safe_set_full_fp32_param.
For details of these interfaces, please refer to the official documentation below.
https://deepspeed.readthedocs.io/en/stable/zero3.html#deepspeed.utils.safe_get_full_fp32_param
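
To make the order of operations concrete, here is a minimal sketch in plain Python on a toy problem. It is not the released MetaRM code. The function name metarm_style_step, the toy gradient functions, and the step sizes eta and alpha are hypothetical, and the meta-process is assumed to be a single temporary gradient step that produces θ'. The point it illustrates is only that the gradient is computed at θ' while the descent step is applied to the backed-up θ_t. In a DeepSpeed ZeRO-3 run, the backup and restore would go through safe_get_full_fp32_param and safe_set_full_fp32_param rather than list copies.

```python
from typing import Callable, List

Params = List[float]
GradFn = Callable[[Params], Params]


def metarm_style_step(theta: Params, meta_grad: GradFn, loss_grad: GradFn,
                      eta: float, alpha: float) -> Params:
    """One iteration: back up, meta-process, backprop at theta', restore, descend."""
    # 1. Back up the current parameters theta_t.
    backup = list(theta)

    # 2. Meta-process (ASSUMPTION: one temporary gradient step on a meta
    #    objective, giving theta'). The real meta-process may differ.
    theta_prime = [p + eta * g for p, g in zip(theta, meta_grad(theta))]

    # 3. "Backpropagation": gradient of the training loss evaluated at theta'.
    grads = loss_grad(theta_prime)

    # 4. Restore the backed-up theta_t and apply gradient descent using the
    #    gradients that were computed at theta'.
    return [p - alpha * g for p, g in zip(backup, grads)]


# Toy objectives (hypothetical, for illustration only).
def meta_grad(theta: Params) -> Params:
    # Gradient of J(theta) = sum(theta): all ones.
    return [1.0 for _ in theta]


def loss_grad(theta: Params) -> Params:
    # Gradient of L(theta) = 0.5 * sum(theta_i ** 2): theta itself.
    return list(theta)


theta_t = [1.0, -2.0]
theta_next = metarm_style_step(theta_t, meta_grad, loss_grad, eta=0.5, alpha=0.1)
print(theta_next)
```

Note that theta_t itself is never overwritten by the temporary θ': only the gradient carries information from the meta-process into the final update.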

I hope this answer proves helpful to you! Feel free to raise any questions if you have any.
