
[full-ci] Use nats-js cache #7793

Merged: 5 commits into owncloud:master on Dec 14, 2023
Conversation

@kobergj (Collaborator) commented Nov 23, 2023

Fixes and activates the nats-js cache

Fixes #7049

@kobergj force-pushed the FixNatsjsCache branch 3 times, most recently from c2f9a2f to 0ca2c79 on November 23, 2023 14:26
@kobergj marked this pull request as draft on November 23, 2023 14:26
@kobergj force-pushed the FixNatsjsCache branch 11 times, most recently from 74b51a2 to 42755b2 on November 29, 2023 14:41
@kobergj force-pushed the FixNatsjsCache branch 9 times, most recently from 927f9bc to 62d5996 on December 5, 2023 14:21
@kobergj force-pushed the FixNatsjsCache branch 7 times, most recently from 7837c80 to b59ffbd on December 7, 2023 11:04
```diff
@@ -64,6 +64,8 @@ import (
 var (
 	// runset keeps track of which services to start supervised.
 	runset map[string]struct{}
+	// time to wait after starting the preliminary services
+	_preliminaryDelay = 6 * time.Second
```
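A minimal sketch of how a supervisor might apply such a delay: start the preliminary services (e.g. nats) first, wait, then start the dependent services. The service names and start logic below are illustrative assumptions; only the 6-second constant comes from the diff above.

```go
package main

import (
	"fmt"
	"time"
)

// time to wait after starting the preliminary services
var _preliminaryDelay = 6 * time.Second

func main() {
	// hypothetical service lists; the real supervisor derives these
	// from its configuration
	preliminary := []string{"nats"}
	dependent := []string{"gateway", "storage-users", "storage-system"}

	for _, name := range preliminary {
		fmt.Println("starting preliminary service:", name)
	}

	// give the registry/cache backend time to come up before the
	// dependent services try to register against it
	time.Sleep(_preliminaryDelay)

	for _, name := range dependent {
		fmt.Println("starting service:", name)
	}
}
```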
Member:

So, we need to wait for nats to get the registry going? Does that mean that the micro registry and client / selector implementations die if they cannot talk to the actual registry? How is that supposed to work in a microservice architecture? Shouldn't services kind of assemble gracefully? Why is our built-in supervisor so different from a kubernetes cluster?

@kobergj (Collaborator, Author) replied:

This is actually a good point, but I don't know how to avoid that for the moment.

In a perfect world they would assemble gracefully. But there is no such mechanism implemented for the registry (nor for the cache). If services can't connect to the registry, they die immediately. Additionally, if services die during initialization of the single binary, they will not come up again (at least that is what I observed in various runs).

Is the single binary supposed to restart services during initialization? Wouldn't that lead to infinite restarts when the services are misconfigured?

> Why is our built-in supervisor so different from a kubernetes cluster?

This question I don't understand. How could our self-made supervisor be like a kubernetes cluster? Is it supposed to be?
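For context, a hedged sketch of the missing "graceful assembly" mechanism discussed here: instead of dying immediately when the registry is unreachable, a service could retry with exponential backoff. connectRegistry is a hypothetical stand-in for the real registry client; no such retry loop exists in the code under review.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// connectRegistry is a placeholder that always fails, standing in for
// a real attempt to reach the service registry.
func connectRegistry() error {
	return errors.New("registry not reachable")
}

// connectWithBackoff retries the connection instead of dying on the
// first failure, doubling the delay between attempts.
func connectWithBackoff(maxAttempts int) error {
	delay := 500 * time.Millisecond
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := connectRegistry()
		if err == nil {
			return nil
		}
		fmt.Printf("attempt %d failed: %v, retrying in %v\n", attempt, err, delay)
		time.Sleep(delay)
		delay *= 2
	}
	return fmt.Errorf("giving up after %d attempts", maxAttempts)
}

func main() {
	if err := connectWithBackoff(5); err != nil {
		fmt.Println(err)
	}
}
```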

@kobergj (Collaborator, Author) commented Dec 12, 2023

Open question:

Should we use the nats-js cache by default? It doesn't make much sense for the single-binary scenario imho. On the other hand, we might want to test it. Maybe switch to memory by default but use nats-js for the CI pipeline?

@butonic @micbar
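A minimal sketch of that proposal, assuming the store type is read from the OCIS_CACHE_STORE environment variable (which appears in the config diff further down); the helper itself is illustrative, not the actual ocis code.

```go
package main

import (
	"fmt"
	"os"
)

// cacheStore picks the cache store type, falling back to "memory" for
// the single-binary scenario. A CI pipeline could set
// OCIS_CACHE_STORE=nats-js to exercise the nats-js implementation.
func cacheStore() string {
	if v := os.Getenv("OCIS_CACHE_STORE"); v != "" {
		return v
	}
	return "memory"
}

func main() {
	fmt.Println("using cache store:", cacheStore())
}
```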

@kobergj (Collaborator, Author) commented Dec 13, 2023

Decided to use the memory cache by default and create a separate pipeline to test it. @butonic could you have another look? Seems all ready from my side now...

@kobergj requested a review from butonic on December 13, 2023 11:51
@kobergj marked this pull request as ready for review on December 14, 2023 09:18
@butonic (Member) left a comment:

I think we should separate the filemetadata caches of the storageproviders

@kobergj (Collaborator, Author) commented Dec 14, 2023

Mmh. Not sure if this works. Aren't they invalidating each other? The pipeline will tell us...

@butonic (Member) commented Dec 14, 2023

> Mmh. Not sure if this works. Aren't they invalidating each other? The pipeline will tell us...

No, the filemetadata cache is only shared between the storageprovider and the dataprovider, which are running in the same service. The old stat cache was shared between the gateway and the storageproviders, but redis got clogged because we have to invalidate entries too often and the listing of keys via SCAN was too slow. Which is why we got rid of the 'cross service stat cache'. We should kill that code ...

So, the storage-users and storage-system caches are two completely distinct caches. We should configure them correspondingly.
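An illustrative sketch of what "two completely distinct caches" means, assuming each storage provider's filemetadata cache is backed by its own store (e.g. its own nats-js bucket), so invalidating one cannot touch the other. The types below are hypothetical, not the actual ocis cache API.

```go
package main

import "fmt"

// fileMetadataCache is a toy cache keyed by a distinct store name.
type fileMetadataCache struct {
	store string // e.g. the nats-js bucket backing this cache
	data  map[string]string
}

func newFileMetadataCache(store string) *fileMetadataCache {
	return &fileMetadataCache{store: store, data: map[string]string{}}
}

func (c *fileMetadataCache) Put(key, val string) { c.data[key] = val }

// InvalidateAll clears only this cache's store, leaving the other
// provider's cache untouched.
func (c *fileMetadataCache) InvalidateAll() {
	c.data = map[string]string{}
	fmt.Println("invalidated store:", c.store)
}

func main() {
	users := newFileMetadataCache("storage-users-filemetadata")
	system := newFileMetadataCache("storage-system-filemetadata")

	users.Put("spaceA/file.txt", "etag1")
	system.Put("settings/bundle", "etag2")

	users.InvalidateAll() // storage-system entries survive
	fmt.Println("system entries left:", len(system.data))
}
```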

Co-authored-by: Jörn Friedrich Dreyer <[email protected]>
```diff
@@ -85,7 +85,7 @@ type StorageRegistry struct {

 // Cache holds cache config
 type Cache struct {
-	StatCacheStore string `yaml:"stat_cache_store" env:"OCIS_CACHE_STORE;GATEWAY_STAT_CACHE_STORE" desc:"The type of the cache store. Supported values are: 'memory', 'ocmem', 'etcd', 'redis', 'redis-sentinel', 'nats-js', 'noop'. See the text description for details."`
+	StatCacheStore string // NOTE: The stat cache is not working atm. Hence we block configuring it
```
@mmattel (Contributor) commented Dec 14, 2023:

@kobergj Note that this also needs a doc change, see:
admin docs and dev-docs. Please update the gateway readme accordingly; I will take care of the admin docs afterwards. Note, you could maybe just comment out the block to ease re-enabling it when functional again... (I will do that on my side)

@kobergj (Collaborator, Author) replied:

Issue tackled here: #7977

SonarCloud: Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

2 New issues
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

@kobergj merged commit 7c3f51a into owncloud:master on Dec 14, 2023
3 checks passed
@kobergj deleted the FixNatsjsCache branch on December 14, 2023 10:50
ownclouders pushed a commit that referenced this pull request Dec 14, 2023
@micbar mentioned this pull request on Dec 20, 2023

Successfully merging this pull request may close these issues.

nats-js cache not working