Remove DictionaryTerm with count 0 during compact (workaround for #374) #376
I spent a couple hours on this last night as a workaround for #374.
The implementation removes all DictionaryTerm entries with Count=0 from the index in configurable batches, with each batch running in its own transaction. Originally I did this all in one large transaction, but I settled on this approach to avoid locking out other writers for an extended period. A batch size of 250 seems like a good default for a server-based implementation. I'll be using batch sizes closer to 50-100 in my app (running on a single-core ARMv7).
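To illustrate the batching idea, here is a minimal sketch, not the code in this PR. It assumes a hypothetical in-memory `termStore` in which a mutex stands in for a write transaction; the names `termStore`, `set`, `compactBatch`, and `compactWithBatchSize` are made up for the example and are not the index's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// termStore is a hypothetical stand-in for the index's key/value store.
// It maps a dictionary term to its count.
type termStore struct {
	mu    sync.Mutex
	terms map[string]uint64
}

func newTermStore() *termStore {
	return &termStore{terms: make(map[string]uint64)}
}

// set records a count for a term (hypothetical helper for the example).
func (s *termStore) set(term string, count uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.terms[term] = count
}

// compactBatch deletes up to batchSize zero-count terms while holding the
// lock once. Holding the lock here plays the role of one short transaction.
// It returns the number of entries removed.
func (s *termStore) compactBatch(batchSize int) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for term, count := range s.terms {
		if removed >= batchSize {
			break
		}
		if count == 0 {
			delete(s.terms, term) // deleting while ranging over a map is safe in Go
			removed++
		}
	}
	return removed
}

// compactWithBatchSize runs short batches until a batch comes back partially
// filled, which means the scan reached the end without finding more
// zero-count terms. Other writers can take the lock between batches.
func (s *termStore) compactWithBatchSize(batchSize int) (int, error) {
	if batchSize <= 0 {
		return 0, fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	total := 0
	for {
		n := s.compactBatch(batchSize)
		total += n
		if n < batchSize {
			return total, nil
		}
	}
}

func main() {
	s := newTermStore()
	s.set("alpha", 2)
	s.set("beta", 0)
	s.set("gamma", 0)
	s.set("delta", 0)

	removed, err := s.compactWithBatchSize(2)
	if err != nil {
		panic(err)
	}
	fmt.Println("removed:", removed, "remaining:", len(s.terms))
}
```

The trade-off is the one described above: a smaller batch size shortens each exclusive section at the cost of more lock acquisitions overall.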
I ran this test, which I wrote before submitting #374:
I verified that there were no dictionary term entries present after the test, and the `.ldb` file contained only the document mapping and fields.

I also ran a variation of the test above to ensure that documents could still be indexed during the execution of `CompactWithBatchSize`. Just before calling `CompactWithBatchSize`, I started another goroutine to index documents during the compact. I verified that the documents were indexed, the document count after finishing was correct, and there were no dictionary term entries with count 0.
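
The shape of that concurrent check can be sketched roughly as follows. This is not the actual test. It uses a hypothetical in-memory `index` type guarded by a mutex instead of the real index, and `indexDoc`, `compact`, and `runCheck` are names invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical in-memory stand-in for the real index. It tracks
// indexed documents and dictionary term counts behind a single mutex.
type index struct {
	mu    sync.Mutex
	docs  map[string]bool
	terms map[string]uint64
}

// indexDoc adds a document and bumps the count of each of its terms.
func (ix *index) indexDoc(id string, terms ...string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	ix.docs[id] = true
	for _, t := range terms {
		ix.terms[t]++
	}
}

// compact removes zero-count terms in batches, releasing the lock between
// batches so that concurrent indexDoc calls can proceed.
func (ix *index) compact(batchSize int) {
	if batchSize <= 0 {
		batchSize = 250 // mirrors the default suggested in the description
	}
	for {
		ix.mu.Lock()
		removed := 0
		for t, c := range ix.terms {
			if removed == batchSize {
				break
			}
			if c == 0 {
				delete(ix.terms, t)
				removed++
			}
		}
		ix.mu.Unlock()
		if removed < batchSize {
			return // the scan finished without filling a batch
		}
	}
}

// runCheck seeds stale zero-count terms, indexes documents from a second
// goroutine while compact runs, and returns the final document count and
// the number of zero-count terms left behind.
func runCheck(stale, docs, batchSize int) (docCount, zeroTerms int) {
	ix := &index{docs: map[string]bool{}, terms: map[string]uint64{}}
	for i := 0; i < stale; i++ {
		ix.terms[fmt.Sprintf("stale-%d", i)] = 0
	}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < docs; i++ {
			ix.indexDoc(fmt.Sprintf("doc-%d", i), fmt.Sprintf("term-%d", i))
		}
	}()

	ix.compact(batchSize)
	wg.Wait()

	ix.mu.Lock()
	defer ix.mu.Unlock()
	for _, c := range ix.terms {
		if c == 0 {
			zeroTerms++
		}
	}
	return len(ix.docs), zeroTerms
}

func main() {
	docCount, zeroTerms := runCheck(1000, 200, 50)
	fmt.Println("documents:", docCount, "zero-count terms:", zeroTerms)
}
```

Running a check like this under `go run -race` is a cheap way to confirm that the writer and the compaction never touch the maps without holding the lock.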