NATS registry / cache / store #7272

Closed
1 task
wkloucek opened this issue Sep 12, 2023 · 25 comments
Labels: Interaction:Minor-Release, Type:Epic (Epic is the parent of user stories)

@wkloucek
Contributor

wkloucek commented Sep 12, 2023

User Story

As a SaaS provider I need to support a scalable deployment for my registry / cache / stores to be able to perform under changing load.

Acceptance Criteria

  • Implemented go-micro plugins for Nats-JS KV Store

Is your feature request related to a problem? Please describe.

As a user I want to have as few components as possible. I would love to use NATS as registry / cache / store. Currently I have to use different components.

Describe the solution you'd like

Have a performant NATS registry / cache / store implementation for the KV feature based on NATS Jetstream.

Have it load tested: it should distribute load, be sufficiently fast, be stable / highly available, and delete unneeded data (retention).

We should also think about dropping official support for the other registry implementations (etcd, consul, memory, mdns, kubernetes) and cache / store implementations (redis, redis-sentinel, noop, memory, ocmem), since many of them are only usable in a limited range of deployments and / or are not battle tested. The official documentation currently lists them all, so I understand them to be officially supported.

Describe alternatives you've considered

Additional context

Other known NATS topics:

@wkloucek
Contributor Author

Was discussed at the Hack-Week by:

@dj4oC
Contributor

dj4oC commented Nov 4, 2023

@tbsbdr please schedule for next sprint since this is blocking further growth including the other NATS related issues mentioned by @wkloucek above

@micbar
Contributor

micbar commented Nov 4, 2023

Already worked on, see status.

@dj4oC
Contributor

dj4oC commented Nov 5, 2023

> Already worked on, see status.

true. thanks for the spotlight. but there is more to do, right?

micbar closed this as completed Nov 21, 2023
github-project-automation bot moved this from In progress to Done in Infinite Scale Team Board Nov 21, 2023

@wkloucek
Contributor Author

Is this really fulfilled?

We now have a nats-js registry. But what about the cache?

wkloucek reopened this Nov 21, 2023
github-project-automation bot moved this from Done to In progress in Infinite Scale Team Board Nov 21, 2023

@kobergj
Collaborator

kobergj commented Nov 21, 2023

@wkloucek isn't the cache already using nats-js store? (The nats-js store was already using the key-value store interface of jetstream. Only the registry implementation was not.)

@wkloucek
Contributor Author

wkloucek commented Nov 21, 2023

> @wkloucek isn't the cache already using nats-js store? (The nats-js store was already using the key-value store interface of jetstream. Only the registry implementation was not.)

My last info is that the cache does not work. See also #7049

But looking at all the linked tickets, there is more to this than just a working cache / store / registry implementation. We need to clarify all operational questions, please. Can I use a memory backed stream? Who is responsible for creating streams? Who is responsible for configuring stream replicas? Are we clean when it comes to retention? Are we using the KV store / cache in a performant way?

@wkloucek
Contributor Author

E.g. the registry could also be a memory backed stream if that has advantages.

@kobergj
Collaborator

kobergj commented Nov 22, 2023

I see. I wasn't aware of #7049. Seems like a standard panic. I'll take a look.

Regarding the other questions. I have no clue :) Should we have another meeting where we discuss where we stand and what needs to be done?

@wkloucek
Contributor Author

> Regarding the other questions. I have no clue :) Should we have another meeting where we discuss where we stand and what needs to be done?

To be honest, nothing has really changed since #7272 (comment). Those questions still need an answer (and modified code if needed). For that it might be helpful to read the NATS (Jetstream) documentation. I have already read parts of it and can be there as a sparring partner. But in general it makes sense to have a NATS "expert" in the oCIS development team since it's a really crucial part of oCIS.

@kobergj
Collaborator

kobergj commented Nov 22, 2023

Not so much fan of the "expert" pattern. I would prefer everybody in the team to know about nats (jetstream) as it is the backbone of the system.

But still I am uncertain what still needs to be done and where the biggest pain points are. Your questions in #7272 (comment) sound more like "how do we want to do it" than "how do we have to do it" questions.

I'm happy to drive natsjs improvements. I just don't know where to start.

@wkloucek
Contributor Author

> Not so much fan of the "expert" pattern. I would prefer everybody in the team to know about nats (jetstream) as it is the backbone of the system.

Also fine for me. But probably one person needs to go ahead since we can't dedicate the full team to reading documentation for 2 days, right?

> But still I am uncertain what still needs to be done and where the biggest pain points are. Your questions in #7272 (comment) sound more like "how do we want to do it" than "how do we have to do it" questions.
>
> I'm happy to drive natsjs improvements. I just don't know where to start.

A first question would be e.g. #7119:
Am I allowed to use memory streams? If so, how can I configure them? The ticket already talks about benefits of memory streams (see benchmark) but also about the problem when currently trying to use memory streams (immutable).

Next question: is the new registry implementation actually distributing load? The nats registry didn't do that from what I know (see #7188)

@kobergj
Collaborator

kobergj commented Nov 22, 2023

Oki.

@wkloucek
Contributor Author

I added another NATS topic which could really help for our SaaS: #7801

@wkloucek
Contributor Author

Seems like the natsjs registry triggers some excessive logging on the NATS side: #7948

@micbar
Contributor

micbar commented Jan 22, 2024

@kobergj @wkloucek We need to check the status of the NATS implementation please.

micbar added this to the Release 5.0.0 milestone Jan 22, 2024

@fschade
Contributor

fschade commented Jan 24, 2024

@kobergj closable?

@wkloucek
Contributor Author

wkloucek commented Jan 25, 2024

What we identified during that status meeting:

#7231 (comment)

#7245 (comment)

#7023 -> not yet implemented but also not pressing

and one cache was still on file storage instead of memory storage 🤔

@kobergj
Collaborator

kobergj commented Jan 25, 2024

> #7231 (comment)

Will look into that today

> #7245 (comment)

This is just changing default values. Should we do that for the single binary too?

> #7023

This needs to be tackled with a followup ticket

> and one cache was still on file storage instead of memory storage 🤔

No, not a cache. It was the registry. This is already fixed with #8236

@micbar
Contributor

micbar commented Jan 25, 2024

> #7245 (comment)
>
> This is just changing default values. Should we do that for the single binary too?

Please do so, yes.

@wkloucek
Contributor Author

> No, not a cache. It was the registry. This is already fixed with #8236

Thanks for keeping that information safe! I already forgot about it.

@dj4oC
Contributor

dj4oC commented Jan 25, 2024

Please don't forget https://github.com/owncloud/enterprise/issues/6354

micbar added the Type:Epic label (Epic is the parent of user stories) and removed the Type:Story (User Story) label Jan 26, 2024

@wkloucek
Contributor Author

Discovered during another review:

  • main-queue maxAge 168h
  • KV_cache-userinfo maxAge could be higher, but that needs invalidation / extra validation -> @kobergj will create an extra ticket
  • KV_postprocessing, KV_ids-storage-users and KV_storage-users maxAge 168h
  • KV_service-registry maxAge 60s
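
For reference, the values from this list can be written down as a small lookup table. This is only a minimal sketch in Go using the standard library: the names `reviewedMaxAge` and `parseMaxAges` are hypothetical and not taken from the oCIS code base, and KV_cache-userinfo is left out because no concrete value was agreed on.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// reviewedMaxAge lists the maxAge values noted in the review above.
// Hypothetical table for illustration only; not part of oCIS.
var reviewedMaxAge = map[string]string{
	"main-queue":           "168h",
	"KV_postprocessing":    "168h",
	"KV_ids-storage-users": "168h",
	"KV_storage-users":     "168h",
	"KV_service-registry":  "60s",
}

// parseMaxAges converts the textual values into time.Duration and
// reports which entry is malformed if parsing fails.
func parseMaxAges(in map[string]string) (map[string]time.Duration, error) {
	out := make(map[string]time.Duration, len(in))
	for name, raw := range in {
		d, err := time.ParseDuration(raw)
		if err != nil {
			return nil, fmt.Errorf("stream %q: %w", name, err)
		}
		out[name] = d
	}
	return out, nil
}

func main() {
	ages, err := parseMaxAges(reviewedMaxAge)
	if err != nil {
		panic(err)
	}
	// Sort the names so the listing is stable between runs.
	names := make([]string, 0, len(ages))
	for name := range ages {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-22s %v\n", name, ages[name])
	}
}
```

A table like this could be compared against what the NATS server actually reports for each stream, to spot buckets whose retention drifted from the reviewed values.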

@kobergj
Collaborator

kobergj commented Jan 26, 2024

> KV_cache-userinfo maxAge could be higher, but that needs invalidation / extra validation -> @kobergj will create an extra ticket

#8297

@kobergj
Collaborator

kobergj commented Jan 26, 2024

Guess we tackled all tickets here. I'll close this one for now.

kobergj closed this as completed Jan 26, 2024
github-project-automation bot moved this from In progress to Done in Infinite Scale Team Board Jan 26, 2024
5 participants