Created CoSENTLoss.py #2454

Merged: 10 commits, merged Jan 31, 2024
Conversation

johneckberg (Contributor)

PR Overview

Details

  • Implemented CoSENT loss as described in this blogpost and the AnglE repo.
  • This loss follows the same input format as CosineSimilarityLoss: each input contains a pair of sentences and a floating-point label, which allows for applications beyond two-category data. To use this loss with MultipleNegativesRankingLoss, as done here in the AnglE repo, an internal method must be used. I can't think of a way to support a different input format without making assumptions about the labels, which would limit applications beyond two-category data. If anyone has any ideas, take a look at the code and leave a comment!
  • Subtracting 1e12 from the irrelevant values in the scores matrix is what makes them negligible in the logsumexp. I believe this is the constant used by the original implementation, and no other implementation has seen a need to change it; for consistency I have also used 1e12.
  • The example in the docstring uses InputExamples rather than a Dataset/DatasetDict, which will need to change in the future.
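
The masking-and-logsumexp trick described above can be sketched in plain Python. This is a minimal, hypothetical sketch of the CoSENT objective for illustration only, not the PR's PyTorch implementation; the function name and the scale default of 20 are assumptions:

```python
import math

def cosent_loss(scores, labels, scale=20.0):
    """Sketch of: loss = log(1 + sum over pairs (i, j) with labels[i] < labels[j]
    of exp(scale * (scores[i] - scores[j]))).

    scores: cosine similarity of each sentence pair in the batch
    labels: floating-point similarity label of each pair
    """
    scaled = [s * scale for s in scores]
    # Only pairs where pair j is labeled more similar than pair i contribute.
    # In a tensorized version this is the full pairwise difference matrix,
    # with 1e12 subtracted from the irrelevant entries so exp() makes them
    # negligible inside the logsumexp.
    terms = [
        scaled[i] - scaled[j]
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] < labels[j]
    ]
    # The "1 +" inside the log is exp(0); a tensorized version would
    # concatenate a zero to the terms before calling logsumexp.
    terms.append(0.0)
    m = max(terms)  # standard logsumexp stabilization
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

When higher-labeled pairs already score higher, every kept difference is large and negative, the sum is dominated by exp(0) = 1, and the loss approaches 0; inverted orderings are penalized roughly linearly in the scaled score gap.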

@tomaarsen (Collaborator)

tomaarsen commented Jan 29, 2024

Hello!

I've run various scripts that use CosineSimilarityLoss with both that loss and CoSENT, and the Spearman correlation coefficient for cosine similarity against the STSBenchmark test set almost universally improves with CoSENT over CosineSimilarityLoss.
This is very exciting! I'll be diving into the exact implementation details and blogpost soon.

All figures are Spearman correlation based on cosine similarity:

  • train_sts_indomain_semantic.py: Cosine 84.21, CoSENT 85.47
  • training_multi-task.py: Cosine 81.51, CoSENT 82.41
  • training_stsbenchmark_continue_training.py: Cosine 86.35, CoSENT 87.01
  • training_stsbenchmark.py: Cosine 79.18, CoSENT 84.25

Code looks good too! Seems to match the other implementations.

As for your comments, it makes sense to consider this a drop-in replacement for CosineSimilarityLoss, and thus to support that input format (sentence pairs with a similarity score label). The 1e12 choice is also good; I'd like to stick to the official implementation.

  • Tom Aarsen

@tomaarsen (Collaborator)

I think this might already be ready for merging! What do you think @johneckberg?

@johneckberg (Contributor, Author)

@tomaarsen Go ahead! If you think that the documentation is clear and easy enough to follow, then I also think it's ready for merging.

@tomaarsen (Collaborator) left a review comment


I've added CoSENTLoss to the docs & updated the formatting slightly. I also mentioned my experiments where CoSENTLoss outperformed CosineSimilarityLoss.


cc: @bojone You might be interested in this work; we're impressed with the performance of CoSENT, and would like to implement it into Sentence Transformers to improve the accessibility of your loss function.

  • Tom Aarsen

Review thread on sentence_transformers/losses/CoSENTLoss.py (resolved)
Updated documentation to match description in original article
@bojone
bojone commented Jan 30, 2024

@tomaarsen Wonderful! I'm very happy to see CoSENT receiving further recognition.

@tomaarsen tomaarsen merged commit 9df8c43 into UKPLab:master Jan 31, 2024
9 checks passed
@tomaarsen (Collaborator)

Very excited to see this loss function merged. I'll be experimenting with it myself!

Linked issue: Implement CoSENT loss