Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binaryHeader: fixed "invalid size" error with non existing label value #2386

Conversation

mkabischev
Copy link

@mkabischev mkabischev commented Apr 6, 2020

Fixes #2213

There is a problem in sort.Search usage that causes floating error when try to get offsets for non existing combination of name & values.

@mkabischev mkabischev force-pushed the fix-posting-offset-search branch from b58dc57 to f6293cc Compare April 6, 2020 17:58
break
}
if i > 0 && e.offsets[i].value != value {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need this (:

Copy link
Author

@mkabischev mkabischev Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Really I didn't get this logic.
if e.offsets[i].value != value than means that we didn't find value and have to do nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we decrement the i then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like This is still not clear, I think what would help is some extra unit test to verify

Copy link
Author

@mkabischev mkabischev Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to decrement it? I mean that when there is no value found then we need break.

Copy link
Member

@bwplotka bwplotka Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole idea behind this code is to find matching offset OR the largest offset that is before wanted value. Then we iterate one by one to find exact offset.

This is because binary index hold in memory only 1/32 of offsets (to not keep all in memory). Problem is that our tests have only 3 postings most likely that's why we test nothing..

It's extremely concerning that our tests passed given that this version will literally skip 31/32 postings ((:

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this but I think you missed important information that we are this offsets hold only 1/32 of posting offsets referenced by name = value. This means that if e.offsets[i].value != value we really need to iterate over 32 offsets from the actual mapped file. This is super scary that we cannot both:

  • Reproduce your issue in unit test
  • Fail unit test on such clear broken logic )))):

We need a test, let's focus on this tomorrow. The bug must be somewhere here, but this change is not fixing it yet. Let's dig dipper tomorrow! 🤗

break
}
if i > 0 && e.offsets[i].value != value {
Copy link
Member

@bwplotka bwplotka Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole idea behind this code is to find matching offset OR the largest offset that is before wanted value. Then we iterate one by one to find exact offset.

This is because binary index hold in memory only 1/32 of offsets (to not keep all in memory). Problem is that our tests have only 3 postings most likely that's why we test nothing..

It's extremely concerning that our tests passed given that this version will literally skip 31/32 postings ((:

@mkabischev
Copy link
Author

We found the root cause of error in another place, so close this PR. Fix will be in another one.

@mkabischev mkabischev closed this Apr 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants