RerankingEvaluator does not write the csv file when the backward compatible `fit()` method is used. Possible fix suggested. #3062

bluebalam · 2024-11-16T10:43:17Z

sentence-transformers version tested: 3.3 (branch: 3.3-release)
Context: we are migrating a production pipeline that fine-tunes a model from sentence-transformers 2.x to 3.x.
In 2.x we used the now-deprecated model.fit() method. As a step toward 3.x, we tested the fit() method provided for backward compatibility. However, the RerankingEvaluator we pass during training no longer writes the csv file with the metrics, as it did before. This file is very useful to us.
Below are my notes on the issue.

RerankingEvaluator needs the output_path to be set in order to write the csv file:

sentence-transformers/sentence_transformers/evaluation/RerankingEvaluator.py

Line 138 in ea49e01

if output_path is not None and self.write_csv:

However, if my understanding is correct, the EvaluatorCallback class in fit_mixin.py calls the evaluator without passing output_path:

sentence-transformers/sentence_transformers/fit_mixin.py

Line 117 in ea49e01

evaluator_metrics = self.evaluator(model, epoch=state.epoch)

As a result, the file is not written to disk, because output_path is None. Note that the write_csv: bool flag defaults to True, so the other condition the RerankingEvaluator checks before dumping the file is already satisfied:

sentence-transformers/sentence_transformers/evaluation/RerankingEvaluator.py

Line 138 in ea49e01

if output_path is not None and self.write_csv:
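To make the behaviour concrete, here is a minimal, self-contained sketch. It does not use sentence-transformers: ToyEvaluator, its csv_file name, the csv_written helper, and the placeholder metric are all hypothetical stand-ins. Only the guard condition is copied from the line quoted above. It shows that a call without output_path skips the csv write even though write_csv is True.

```python
import csv
import os
import tempfile


class ToyEvaluator:
    """Hypothetical stand-in that mirrors only the csv guard quoted above."""

    def __init__(self, write_csv: bool = True):
        self.write_csv = write_csv
        self.csv_file = "toy_results.csv"  # assumed file name, for illustration only

    def __call__(self, model, output_path=None, epoch=-1, steps=-1):
        metrics = {"map": 0.5}  # placeholder value, not a real result
        # Same guard as in the issue: both conditions must hold to write the file.
        if output_path is not None and self.write_csv:
            csv_path = os.path.join(output_path, self.csv_file)
            is_new = not os.path.isfile(csv_path)
            with open(csv_path, "a", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                if is_new:
                    writer.writerow(["epoch", "steps", "map"])
                writer.writerow([epoch, steps, metrics["map"]])
        return metrics


def csv_written(pass_output_path: bool) -> bool:
    """Call the toy evaluator once and report whether a csv file appeared."""
    with tempfile.TemporaryDirectory() as tmp:
        evaluator = ToyEvaluator()
        if pass_output_path:
            evaluator(None, output_path=tmp, epoch=1)
        else:
            # Mirrors the call shape in fit_mixin.py: no output_path is passed.
            evaluator(None, epoch=1)
        return os.path.isfile(os.path.join(tmp, evaluator.csv_file))
```

Calling csv_written(False) corresponds to the situation described in this issue, and csv_written(True) to the behaviour we relied on in 2.x.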

I think a possible fix would be to modify the call to the evaluator in fit_mixin.py from

evaluator_metrics = self.evaluator(model, epoch=state.epoch)

to

evaluator_metrics = self.evaluator(model, epoch=state.epoch, output_path=args.output_dir)

I do understand this method is provided only to support the transition and it will be removed, but I think it would be nice to have it working as expected in the meantime :) .


tomaarsen · 2024-11-18T09:26:43Z

Hello!

> I do understand this method is provided only to support the transition and it will be removed, but I think it would be nice to have it working as expected in the meantime :) .

I totally agree. There's a good reason that I added the fit() backwards compatibility support: it is meant specifically for cases like yours, where people transition from 2.x to 3.x. I do recommend eventually switching to the new Trainer, as it's simply more powerful, but I'll try to fix this.

I believe your proposed fix does not behave the same as before, because the output_dir in the args is set to checkpoint_path rather than the provided output_path. I see that the old model.fit() created an eval folder under the output_path and passed that folder to the evaluator as its output_path.

Tom Aarsen
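For readers following along, the old behaviour described in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in #3066: resolve_eval_dir is a hypothetical helper name, and only the "eval" subfolder under output_path is taken from the comment itself.

```python
import os
import tempfile
from typing import Optional


def resolve_eval_dir(output_path: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the folder an evaluator should write into.

    Follows the description above: an "eval" folder is created under the
    user-provided output_path and used as the evaluator's output_path.
    If no output_path was given, None is returned and no csv would be written.
    """
    if output_path is None:
        return None
    eval_dir = os.path.join(output_path, "eval")
    os.makedirs(eval_dir, exist_ok=True)
    return eval_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The evaluator call would then look like:
        # evaluator(model, output_path=resolve_eval_dir(tmp), epoch=..., steps=...)
        print(resolve_eval_dir(tmp))
```

The point of the sketch is the distinction raised in the comment: the evaluator's folder derives from the output_path given to fit(), not from the checkpoint directory stored in args.output_dir.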

tomaarsen mentioned this issue Nov 18, 2024

[training] Pass steps/epoch/output_path to Evaluator during training #3066

Merged

tomaarsen closed this as completed in #3066 Nov 19, 2024
