yugabyte/service: parallel batch inserts for car indices #1795

Merged 1 commit on Nov 1, 2023

@nonsense nonsense commented Nov 1, 2023

This PR builds on top of the work @ischasny did with batch inserts for indices.

It splits the recs slice (which can hold anywhere from 0 to 50M-100M entries) into 32 chunks that are processed in parallel.

Given that the batch size is 10k, it makes sense to have concurrent goroutines inserting the index.
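
For readers who want to see the shape of the approach, here is a minimal sketch of the chunk-then-batch pattern described above. It is not the code from this PR: `forEachBatchParallel`, `runDemo`, and the `insert` callback are hypothetical names, and the callback stands in for whatever actually executes a batch against the database. The sketch uses ceiling division for the chunk size, which is an assumption made here so that the chunk size never becomes zero when there are fewer records than workers.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// forEachBatchParallel (hypothetical helper) splits items into at most
// `workers` contiguous chunks, processes the chunks concurrently, and inside
// each chunk calls insert with slices of at most batchSize items.
// It returns the first error reported by insert, if any.
func forEachBatchParallel[T any](ctx context.Context, items []T, workers, batchSize int,
	insert func(context.Context, []T) error) error {
	if len(items) == 0 {
		return nil
	}
	if workers < 1 || batchSize < 1 {
		return fmt.Errorf("workers and batchSize must be positive")
	}
	// Ceiling division: chunkSize is at least 1 even when len(items) < workers.
	chunkSize := (len(items) + workers - 1) / workers

	// A failing worker cancels wctx so the other workers stop early.
	wctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for start := 0; start < len(items); start += chunkSize {
		end := start + chunkSize
		if end > len(items) {
			end = len(items)
		}
		wg.Add(1)
		go func(chunk []T) {
			defer wg.Done()
			for i := 0; i < len(chunk); i += batchSize {
				if wctx.Err() != nil {
					return // another worker failed or the caller cancelled
				}
				j := i + batchSize
				if j > len(chunk) {
					j = len(chunk)
				}
				if err := insert(wctx, chunk[i:j]); err != nil {
					once.Do(func() { firstErr = err; cancel() })
					return
				}
			}
		}(items[start:end])
	}
	wg.Wait()
	if firstErr != nil {
		return firstErr
	}
	return ctx.Err() // non-nil only if the caller's context was cancelled
}

// runDemo pushes n dummy records through the helper with a stand-in insert
// function and reports how many batches and records that function saw.
func runDemo(n, workers, batchSize int) (batches, records int64, err error) {
	recs := make([]int, n)
	err = forEachBatchParallel(context.Background(), recs, workers, batchSize,
		func(_ context.Context, batch []int) error {
			atomic.AddInt64(&batches, 1)
			atomic.AddInt64(&records, int64(len(batch)))
			return nil
		})
	return batches, records, err
}

func main() {
	batches, records, err := runDemo(100_000, 32, 10_000)
	fmt.Println(batches, records, err)
}
```

One consequence of this layout is that a chunk smaller than the batch size yields a single, smaller batch, so the 10k batch size only takes effect once the input exceeds roughly 32 × 10k records. Whether that trade-off matters for a given cluster is something to measure rather than assume.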

@nonsense nonsense requested review from ischasny and LexLuthr November 1, 2023 15:07
@nonsense nonsense force-pushed the nonsense/parallel-batch-inserts-for-indices branch from 317b9b1 to f6fb85c Compare November 1, 2023 15:11
@nonsense nonsense marked this pull request as ready for review November 1, 2023 15:11
if batch == nil {
batch = s.session.NewBatch(gocql.UnloggedBatch).WithContext(ctx)
batch.Entries = make([]gocql.BatchEntry, 0, s.settings.InsertBatchSize)
threadBatch := len(recs) / 32 // split the slice into go-routine batches for ~32 workers

Shall we take the opportunity to extract the batching / parallelisation logic into a separate function, as it is used twice?

Otherwise lgtm

I am merging this as is for now. We can shorten the code for the next release.

@LexLuthr LexLuthr merged commit 1622d0c into main Nov 1, 2023
@LexLuthr LexLuthr deleted the nonsense/parallel-batch-inserts-for-indices branch November 1, 2023 17:37