
SVRG update at full gradient too late? #34

Open
KrisThielemans opened this issue May 2, 2024 · 3 comments

Comments

@KrisThielemans
Contributor

KrisThielemans commented May 2, 2024

@epapoutsellis and I saw differences between this SVRG implementation and a test implementation of my own. Checking the code:

if self.svrg_iter == 0 or update_flag:
    # update memory
    self.update_memory(x)
    if self.svrg_iter == 0:
        # initialise data passes by one due to full gradient computation from update memory
        self.data_passes[0] = 1.
    else:
        # increment by 1, since full gradient is computed again
        self.data_passes.append(round(self.data_passes[-1] + 1., 4))
    # allocate memory for the difference between the gradient of selected function at iterate
    # and the gradient at snapshot
    # this allocates at the beginning of every update_flag: is wrong
    # self.stochastic_grad_difference = x * 0.
else:
    # implements the (L)SVRG inner loop
    self.functions[function_num].gradient(x, out=self.stoch_grad_at_iterate)
    if self.store_gradients is True:
        self.stoch_grad_at_iterate.sapyb(1., self.list_stored_gradients[function_num], -1., out=self.stochastic_grad_difference)
    else:
        self.stoch_grad_at_iterate.sapyb(1., self.functions[function_num].gradient(self.snapshot), -1., out=self.stochastic_grad_difference)
    # only one randomly selected gradient is seen and appended to data_passes
    self.data_passes.append(round(self.data_passes[-1] + 1./self.num_functions, 4))
# full gradient is added to the stochastic grad difference
if out is None:
    return self.stochastic_grad_difference.sapyb(self.num_functions, self.full_gradient_at_snapshot, 1.)
else:
    self.stochastic_grad_difference.sapyb(self.num_functions, self.full_gradient_at_snapshot, 1., out=out)

I believe that at the "update full gradient" stage, the full gradient is not actually used to update the image. self.stochastic_grad_difference is not changed in this case, and since it is used in the return value, the returned estimate is actually full_gradient + previous_grad_diff * N, which is incorrect.

(I think the current code only works at the first iteration because self.stochastic_grad_difference is initialised to zero.)
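To make the point concrete: the SVRG estimate is N * (grad_i(x) - grad_i(snapshot)) + full_gradient(snapshot), so at a snapshot iteration (where the snapshot is taken at the current x) the difference term should be zero and the estimate should reduce to the full gradient. A small NumPy sketch (toy names, plain arrays standing in for the DataContainers in the code above) of what the current update_flag=True path returns versus what it should:

```python
import numpy as np

rng = np.random.default_rng(0)
num_functions = 5

# stand-ins for the per-subset gradients at the new snapshot point (taken at the current x)
subset_grads = [rng.standard_normal(3) for _ in range(num_functions)]
full_gradient_at_snapshot = np.sum(subset_grads, axis=0)

# stale difference left over from the previous inner-loop iteration
previous_grad_difference = rng.standard_normal(3)

# what the current update_flag=True path returns: stochastic_grad_difference is untouched,
# so the stale difference gets scaled by N and added to the full gradient
buggy_estimate = num_functions * previous_grad_difference + full_gradient_at_snapshot

# what an SVRG snapshot iteration should return: the full gradient itself
print(np.allclose(buggy_estimate, full_gradient_at_snapshot))  # False
```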

Instead, since at a snapshot iteration the iterate coincides with the new snapshot and the variance-reduction term vanishes, I think the update_flag=True case needs to have

        if out is not None:
            out.fill(self.full_gradient_at_snapshot)
            return out
        return self.full_gradient_at_snapshot
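As a sanity check, here is a self-contained toy version of the estimator with this change (NumPy stand-ins and made-up names, not the CIL code): at a snapshot iteration it returns exactly the full gradient.

```python
import numpy as np

def svrg_estimate(grad_funcs, x, snapshot, full_grad_at_snapshot, function_num, at_snapshot):
    """Toy SVRG gradient estimate with the proposed fix (hypothetical names)."""
    n = len(grad_funcs)
    if at_snapshot:
        # snapshot iteration: the estimate is simply the full gradient at the (new) snapshot
        return full_grad_at_snapshot.copy()
    # inner loop: n * (grad_i(x) - grad_i(snapshot)) + full gradient at snapshot
    diff = grad_funcs[function_num](x) - grad_funcs[function_num](snapshot)
    return n * diff + full_grad_at_snapshot

# quadratic subsets f_i(x) = 0.5 * ||A_i x - b_i||^2, so grad_i(x) = A_i^T (A_i x - b_i)
rng = np.random.default_rng(1)
As = [rng.standard_normal((4, 3)) for _ in range(5)]
bs = [rng.standard_normal(4) for _ in range(5)]
grad_funcs = [lambda x, A=A, b=b: A.T @ (A @ x - b) for A, b in zip(As, bs)]

x = rng.standard_normal(3)
snapshot = x.copy()                              # snapshot just taken at x
full_grad = sum(g(x) for g in grad_funcs)

est = svrg_estimate(grad_funcs, x, snapshot, full_grad, function_num=2, at_snapshot=True)
print(np.allclose(est, full_grad))               # True: snapshot step returns the exact full gradient
```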

Pinging @jakobsj @zeljkozeljko @MargaretDuff for confirmation.

@MargaretDuff

@KrisThielemans - do you want to try https://github.com/MargaretDuff/CIL-margaret/tree/svrg? This is the latest branch and the one we will (hopefully) be merging into CIL.

@KrisThielemans
Contributor Author

That one doesn't have the above logic problem as far as I can see. I have a few minor comments on that function. Where/how do you want me to comment on it?

@epapoutsellis
Owner
