Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix a bug for set_difference() #178

Merged
merged 1 commit into from
Aug 17, 2022

Conversation

wenqiang-gu
Copy link
Member

@brettviren I found that, in the function Aux::untagged_traces(), the std::set_difference was given a std::unordered_set. However, the elements are actually assumed to be sorted in the std according to cppreference.

Here is a quick test:

  std::vector<size_t> all={0,1,2,3};

  std::unordered_set<size_t> tagged={0,1,2,3};  // fail
  // std::set<size_t> tagged={0,1,2,3};         // success, but maybe not recommended?
  // std::vector<size_t> tagged={0,1,2,3};      // success

  std::vector<size_t> untagged;
  std::set_difference(all.begin(), all.end(), tagged.begin(), tagged.end(), std::inserter(untagged, untagged.begin()));

Note that untagged_traces is used in Drifter, ChannelSelector, ManifySInk and FrameMerger. Given that most of our use cases are just retrieving all traces when the upstream simulation/data has no tag for traces. Therefore, this bug should not make an actual difference.

It makes a difference only when the upstream frame does have tagged traces, but we still want to retrieve some untagged traces. As an example in the above test, we have traces 0,1,2,3 and all of them are tagged. If we implement it properly, there should be no untagged traces, but the bug would give us untagged traces 0,1,2,3.

@brettviren
Copy link
Member

Good catch, @wenqiang-gu !

I think I was originally worried about the loop over tags producing duplicate entries in the tagged set. But, if I read std::set_difference() correctly, any duplicates in tagged should not matter (maybe the check goes slightly slower) so your fix seems good to me.

@brettviren brettviren merged commit b6c6270 into WireCell:master Aug 17, 2022
@HaiwangYu
Copy link
Member

Hi @brettviren and @wenqiang-gu, I do have a question. Are we sure the tagged is ordered? Otherwise, we may need to sort it first.
test: https://onlinegdb.com/IBdf7AeRt

@brettviren
Copy link
Member

Oh, good question. It might be accidentally true but best to be defensive.

Either a std::sort() or the use of a (sorted) std::set would be good. @wenqiang-gu sorry to miss this and merge so quickly. Can you please add it?

@wenqiang-gu wenqiang-gu mentioned this pull request Aug 18, 2022
@wenqiang-gu
Copy link
Member Author

@brettviren I have made a new PR here: #179

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants