
Wasserstein implementation does not seem to be fully "batched" #10

Open

netw0rkf10w opened this issue Sep 15, 2021 · 1 comment

@netw0rkf10w

Hi @t-vi,

Thanks for sharing your code!

I would like to ask a question about your implementation of the Sinkhorn algorithm. You stated that one of the main motivations was efficient batched computation. However, looking at the code, I see that it only supports the case where the cost matrix is shared across the whole batch:

def forward(ctx, mu, nu, dist, lam=1e-3, N=100):
    assert mu.dim() == 2 and nu.dim() == 2 and dist.dim() == 2
    bs = mu.size(0)
    d1, d2 = dist.size()
    assert nu.size(0) == bs and mu.size(1) == d1 and nu.size(1) == d2

That is, the shape of dist is d1 x d2 rather than bs x d1 x d2, so every sample in the batch must use the same cost matrix. Is this expected?
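For concreteness, a fully batched variant would presumably take dist of shape bs x d1 x d2 and use batched matrix products throughout. Here is a minimal sketch of the textbook (non-stabilized) iteration; batched_sinkhorn is a hypothetical name for illustration, not something from your repository:

import torch

def batched_sinkhorn(mu, nu, dist, lam=1e-3, N=100):
    # mu: (bs, d1), nu: (bs, d2), dist: (bs, d1, d2) -- one cost matrix per sample
    bs, d1, d2 = dist.size()
    assert mu.size() == (bs, d1) and nu.size() == (bs, d2)
    K = torch.exp(-dist / lam)                       # Gibbs kernel, (bs, d1, d2)
    v = torch.ones_like(nu)                          # (bs, d2)
    for _ in range(N):
        u = mu / torch.bmm(K, v.unsqueeze(2)).squeeze(2)                   # (bs, d1)
        v = nu / torch.bmm(K.transpose(1, 2), u.unsqueeze(2)).squeeze(2)   # (bs, d2)
    pi = u.unsqueeze(2) * K * v.unsqueeze(1)         # transport plans, (bs, d1, d2)
    return (pi * dist).sum(dim=(1, 2))               # per-sample Sinkhorn cost, (bs,)

Of course, exp(-dist / lam) underflows quickly for small lam, which is presumably why your implementation works with log_u and log_v; the sketch is only meant to illustrate the batched shapes.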

Thank you in advance for your reply.

@netw0rkf10w (Author)

I also observe that the backward pass does not compute the gradient for dist:

@staticmethod
def backward(ctx, grad_out):
    return grad_out[:, None] * ctx.log_u * ctx.lam, grad_out[:, None] * ctx.log_v * ctx.lam, None, None, None

which unfortunately prevents learning dist (e.g., treating the cost matrix as a trainable parameter)...
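If I understand correctly, the missing gradient is just the transport plan: by the envelope theorem, the derivative of the entropy-regularized cost with respect to the cost matrix is the optimal plan pi. Assuming the forward pass additionally saved pi in ctx (hypothetical; the current code does not), the backward could look something like:

@staticmethod
def backward(ctx, grad_out):
    grad_mu = grad_out[:, None] * ctx.log_u * ctx.lam
    grad_nu = grad_out[:, None] * ctx.log_v * ctx.lam
    # Envelope theorem: d(cost)/d(dist) is the transport plan pi.
    # ctx.pi (bs x d1 x d2) is hypothetical -- forward would need to save it.
    # Since dist is shared across the batch (d1 x d2), the per-sample
    # contributions are summed over the batch dimension.
    grad_dist = (grad_out[:, None, None] * ctx.pi).sum(0)
    return grad_mu, grad_nu, grad_dist, None, None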
