[BUG] locking issue in cache.go fatal error: concurrent map writes #2743

Open
mcarbonne opened this issue Dec 18, 2024 · 0 comments

Describe the bug

When running the latest version of OCIS, which relies on go-micro, I get crashes (owncloud/ocis#10785).

The backtrace looks like this (only the relevant part is shown):

fatal error: concurrent map writes
fatal error: concurrent map writes

goroutine 13004 [running]:
go-micro.dev/v4/registry/cache.(*cache).isValid(0xc003746c60, {0xc0099c38c0, 0x1, 0x16?}, {0xc009750fa0?, 0x0?, 0x5f85c8271d40?})
        go-micro.dev/[email protected]/registry/cache/cache.go:103 +0x1aa
go-micro.dev/v4/registry/cache.(*cache).get(0xc003746c60, {0x5f85c3fe03b1, 0x16})
        go-micro.dev/[email protected]/registry/cache/cache.go:145 +0x136
go-micro.dev/v4/registry/cache.(*cache).GetService(0x5f85c3fe03b1?, {0x5f85c3fe03b1?, 0x5f85c5856f90?}, {0x1?, 0x0?, 0xc00a766b80?})
        go-micro.dev/[email protected]/registry/cache/cache.go:462 +0x18
github.com/cs3org/reva/v2/pkg/rgrpc/todo/pool.(*Selector[...]).Next(0x5f85c6241710, {0x0, 0x0, 0x16})
        github.com/cs3org/reva/[email protected]/pkg/rgrpc/todo/pool/selector.go:117 +0x2b4

I did some research and found the following code in cache.go:

func (c *cache) get(service string) ([]*registry.Service, error) {
	// read lock
	c.RLock()

	// check the cache first
	services := c.cache[service]
	// get cache ttl
	ttl := c.ttls[service]
	// make a copy
	cp := util.Copy(services)

	// got services, nodes && within ttl so return cache
	if c.isValid(cp, ttl) {
		c.RUnlock()
		// return services
		return cp, nil
	}

As far as I understand, get only takes a read lock (c.RLock()), which several goroutines can hold at the same time, but isValid needs the write lock because it mutates a map on this line:

delete(c.nttls, s.Name)

The bug might have been introduced by PR #2736.
