Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/sync: allocate memory from sync.Pool instead of Chunk.NewOffPage to avoid stall problem #5497

Merged

Conversation

YinhaoHu
Copy link
Contributor

@YinhaoHu YinhaoHu commented Jan 2, 2025

close #5492

All unit tests related to this update are adjusted and passed.

The case mentioned in the issue can also run now.

@CLAassistant
Copy link

CLAassistant commented Jan 2, 2025

CLA assistant check
All committers have signed the CLA.

@YinhaoHu YinhaoHu closed this Jan 2, 2025
@YinhaoHu YinhaoHu deleted the fix/assignable-buffer-size-in-sync branch January 2, 2025 16:46
@YinhaoHu YinhaoHu restored the fix/assignable-buffer-size-in-sync branch January 2, 2025 16:50
@YinhaoHu YinhaoHu reopened this Jan 2, 2025
@YinhaoHu YinhaoHu force-pushed the fix/assignable-buffer-size-in-sync branch from 9872447 to 913e5a9 Compare January 2, 2025 16:54
@zhijian-pro
Copy link
Contributor

@YinhaoHu Thank you very much for your contribution, but we think the following approach would be more appropriate.
The issue lies in our sync download logic where we shouldn't be using chunk.NewOffPage to allocate memory.

p := chunk.NewOffPage(int(size))

In this line of code
downer := newParallelDownloader(src, key, size, 10<<20, concurrent)

we have hard-coded a block size of 10M, so we can directly use a fixed-size sync.pool to allocate memory.

@YinhaoHu YinhaoHu force-pushed the fix/assignable-buffer-size-in-sync branch from 913e5a9 to 87c1986 Compare January 6, 2025 13:32
@YinhaoHu YinhaoHu changed the title Introduce Buffer Size Configuration and Warning Logs for Sync Optimization Allocate memory from sync.Pool instead of Chunk.NewOffPage to avoid sync stall problem Jan 6, 2025
@YinhaoHu
Copy link
Contributor Author

YinhaoHu commented Jan 6, 2025

@zhijian-pro Good idea. I have updated this PR based on the more appropriate approach.

@YinhaoHu YinhaoHu force-pushed the fix/assignable-buffer-size-in-sync branch from 87c1986 to 0e12449 Compare January 6, 2025 13:40
@zhijian-pro
Copy link
Contributor

zhijian-pro commented Jan 7, 2025

It is best to adjust the TestDownload unit test so that it passes

@YinhaoHu YinhaoHu force-pushed the fix/assignable-buffer-size-in-sync branch from 0e12449 to 6a7a3e7 Compare January 7, 2025 09:52
@YinhaoHu YinhaoHu requested a review from zhijian-pro January 7, 2025 09:52
@YinhaoHu YinhaoHu force-pushed the fix/assignable-buffer-size-in-sync branch from 6a7a3e7 to c28247d Compare January 7, 2025 10:04
@YinhaoHu
Copy link
Contributor Author

YinhaoHu commented Jan 7, 2025

It is best to adjust the TestDownload unit test so that it passes

All test cases in the TestDownload unit test are modified in an equivalent approach as before. And they all pass.

@zhijian-pro
Copy link
Contributor

@YinhaoHu If the size of the last block is smaller than the default block size, it should not be put back into the pool, otherwise it will cause panic when you get it.

@YinhaoHu
Copy link
Contributor Author

YinhaoHu commented Jan 7, 2025

@YinhaoHu If the size of the last block is smaller than the default block size, it should not be put back into the pool, otherwise it will cause panic when you get it.

The capacity of each buf is always downloadBufSize even if the length is variable, maybe smaller than the default block size, and there will be no block requires capacity more than downloadBufSize. I don't think panic will occur. Or could you show me a case where the panic will occur?

@zhijian-pro
Copy link
Contributor

The last block returned when file A is downloaded may be removed from the pool when file B is downloaded.
But this block may be less than the length of downloadBufSize, and the B file wants it to be downloadBufSize.

@YinhaoHu
Copy link
Contributor Author

YinhaoHu commented Jan 8, 2025

The last block returned when file A is downloaded may be removed from the pool when file B is downloaded. But this block may be less than the length of downloadBufSize, and the B file wants it to be downloadBufSize.

The New function of downloadBufPool is:

New: func() interface{} {
		buf := make([]byte, 0, downloadBufSize)
		return &buf
	},

And the capacity is always downloadBufSize. In your case, the buf might be sliced to length downloadBufSize/2 and returned to the pool after file A is downloaded. The buf would be got again for downloading file B. It is OK to slice it to length downloadBufSize which is not greater than its capacity. If I have got your point, you're meaning the code case below which does not panic:

const downloadBufSize = 10 << 20

var downloadBufPool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 0, downloadBufSize)
		return &buf
	},
}

func Main() {
	downloadFileA()
	downloadFileB()
}

func downloadFileA() {
	buf := downloadBufPool.Get().(*[]byte)
	*buf = (*buf)[:downloadBufSize/2]
	// read file
	printBuf("A", buf)
	downloadBufPool.Put(buf)
}

func downloadFileB() {
	buf := downloadBufPool.Get().(*[]byte)
	*buf = (*buf)[:downloadBufSize]
	// read file
	printBuf("B", buf)
	downloadBufPool.Put(buf)
}

func printBuf(name string, buf *[]byte) {
	fmt.Printf("%v: len = %v, cap = %v\n", name, len(*buf), cap(*buf))
}

Output:

A: len = 5242880, cap = 10485760
B: len = 10485760, cap = 10485760

@zhijian-pro
Copy link
Contributor

OK,I missed *p = (*p)[:size].

@jiefenghuang jiefenghuang changed the title Allocate memory from sync.Pool instead of Chunk.NewOffPage to avoid sync stall problem cmd/sync: allocate memory from sync.Pool instead of Chunk.NewOffPage to avoid stall problem Jan 8, 2025
@jiefenghuang jiefenghuang merged commit 0f14ee1 into juicedata:main Jan 8, 2025
35 checks passed
@YinhaoHu YinhaoHu deleted the fix/assignable-buffer-size-in-sync branch January 8, 2025 13:22
jiefenghuang pushed a commit that referenced this pull request Jan 20, 2025
jiefenghuang pushed a commit that referenced this pull request Jan 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync Process Stalls with High Thread Count Due to Buffer Limitations
4 participants