How to push Avro schemas/protocols so that the registry uses references and doesn't embed everything into the definition #5573
Comments
Thank you for reporting an issue! Pinging @jsenko to respond or triage. |
What are your expectations for how your schemas are laid out locally? Do you have control over that such that you could maintain some extra metadata? Another place to look for this type of thing is in our maven plugin. Especially this part: What do you mean precisely when you say this?
I'd like to better understand your use-case/goals. There is not currently a way to send a bunch of related stuff to registry all at once and have it automatically figure out the details (with references and all that). It is something we've discussed, but not yet implemented. Could be an opportunity to collaborate on something like that if you are interested. |
Regarding "extra metadata": yes, we have full control over whatever we are doing with Avro. I assume you mean metadata in the message definitions themselves? The use case is this: in production we do not want to let different services push their message definitions at will, which means the schema registry has to be in its correct state (with all versions of schemas) when it is running. There are several reasons for that.
It all basically boils down to us wanting deterministic behaviour from the registry. If we are rolling back, we can just grab an older image and know that a service that shouldn't have known about v4 of some message won't magically figure it out because it was pushed to the registry earlier... We can simply wipe the DB, deploy the specific Docker image and get exactly the same behaviour. In dev/QA we let devs go wild and push however they like; in prod, services can't do that, and messages should be preloaded by the Docker image at startup. Would be happy to collaborate. I think all of my problems are solved if we can come up with a new REST endpoint that lets us import AVDL or AVPR files. I'll be glad to contribute as well, I just need some pointers on how you'd approach this. P.S. The unions not working was because we were importing all types directly and not using references; once I let Kafka clients register in an empty registry, all unions worked fine. |
We have been planning on adding gitops support for some time now. Would that help solve the issue? If the data has to be baked into an image, I can think of a variant of gitops with the "repository" baked into the image. |
Hi @lsegv, thank you for the additional context. Much appreciated, and sorry for the delay in responding. We've been trying to get a release out the door. 😓 Admittedly we have not previously considered a use case that bundles the artifacts into a container image. But it's an interesting idea, for the reasons you've given: essentially release immutability. Registry already has a feature that might be leveraged to achieve part of what you want to do: The (Side note: the read-only property is dynamic, which means it can be set via ENV var but can also be changed at runtime via a REST API call. However, the latter capability can also be configured, resulting in a registry that is read-only on startup and cannot be changed). Using this feature would require all of the artifacts to be bundled into a .zip file of the proper format (the format created by Registry during an "export" operation). Creating the .zip file then becomes the challenge. |
Can you explain this a bit more (or maybe link to some required reading)? I confess to not being as much of an expert on Avro as I could be. |
Could you also please explain this part a bit more?
|
I'm not sure how the gitops solution you describe works, but basically anything that would allow us to have deterministic images (effectively snapshots) of what the message versions were at a specific point would be great. @EricWittmann right now, as a POC, the Docker image is built on top of SR 3.0.0; it copies over the schema files and uses a script to call the REST API and import all the schemas... but as I said, the problem is that those schemas are inlined and do not use references (which is why I believe the unions do not work)... and yes, once it loads all the schemas it uses the REST API to close publishing over REST. For the near future I'm going to simply use on-demand push of message types while we integrate SR into the existing mess. But in the long run I have to solve the problem of creating immutable versions of SR based on given .avdl files. The content in the SR has to be in the same shape as it would be when you run a real application (e.g. use references instead of completely independent messages that inline any references). I'll take a look at ImportLifecycleBean to see how it can be used. @jsenko I'll show in the next reply with some screenshots... |
Thanks for all of this additional information and the examples and POC! This is very helpful to understand what you're trying to do. We'll discuss this as soon as we get a chance. 👍 |
Btw the class
One of the messages would not have the package. |
I want to build a custom registry image based on Apicurio. This image would preload all my schemas and enforce them (since I will disable pushing of artifacts by services). There is one big problem though.
If I simply loop over all my AVSC files (which I generate from AVDL) and upload them one by one, the registry does no processing on top of them and does not recognise when two messages reuse the same imported type, because the content of each AVSC is literally inlined to be self-contained.
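To make the inlining problem concrete, here is a minimal sketch (not what the registry or the Kafka serializer actually does) of splitting a self-contained AVSC into one standalone schema per named type, with nested definitions replaced by full-name references. The function names and the `com.example` schema are hypothetical, and the sketch only handles records, enums, fixed types, arrays, maps and unions.

```python
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED = {"record", "error", "enum", "fixed"}


def _full_name(name, namespace):
    # A dotted name is already fully qualified; otherwise prefix the namespace.
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def _walk(node, namespace, out, refs):
    """Return `node` with inlined named types replaced by their full names."""
    if isinstance(node, str):
        if node in PRIMITIVES:
            return node
        full = _full_name(node, namespace)  # reference to a type defined elsewhere
        refs.add(full)
        return full
    if isinstance(node, list):  # union: process each branch
        return [_walk(branch, namespace, out, refs) for branch in node]
    kind = node["type"]
    if kind in NAMED:
        full = _full_name(node["name"], node.get("namespace", namespace))
        own_ns, _, short = full.rpartition(".")
        own = {k: v for k, v in node.items() if k != "namespace"}
        own["name"] = short
        if own_ns:
            own["namespace"] = own_ns
        own_refs = set()
        if "fields" in node:
            own["fields"] = [
                dict(f, type=_walk(f["type"], own_ns, out, own_refs))
                for f in node["fields"]
            ]
        own_refs.discard(full)  # a recursive type does not reference "another" schema
        out[full] = (own, own_refs)
        refs.add(full)
        return full
    if kind == "array":
        return dict(node, items=_walk(node["items"], namespace, out, refs))
    if kind == "map":
        return dict(node, values=_walk(node["values"], namespace, out, refs))
    # e.g. {"type": "long", "logicalType": "timestamp-millis"}
    return dict(node, type=_walk(kind, namespace, out, refs))


def split_named_types(schema):
    """Map full name -> (standalone schema, set of full names it references)."""
    out = {}
    _walk(schema, None, out, set())
    return out


# Hypothetical self-contained AVSC, as generated from an AVDL protocol.
order = {
    "type": "record", "name": "Order", "namespace": "com.example",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "shipTo", "type": {
            "type": "record", "name": "Address",
            "fields": [{"name": "city", "type": "string"}]}},
        {"name": "billTo", "type": ["null", "Address"]},
    ],
}
parts = split_named_types(order)
```

Running this over every AVSC file and merging the results by full name would give one entry per type, so a type shared by two messages shows up once instead of being inlined twice. How each entry and its reference set are then sent to the registry is left out on purpose.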
I started looking for an API that would allow me to upload protocol files (AVDL or AVPR), but I found no such thing.
When I use a Kafka producer/consumer example and let it push the definitions on the go, I see that it pushes messages properly (e.g. uses references), but the actual code that does this does a lot of work: it figures out all those references and uploads them correctly to the registry.
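One part of that work is ordering: a schema can presumably only point at references that already exist in the registry, so referenced types have to be uploaded first. A rough sketch of that step, assuming the dependencies are already known as a mapping from type name to the names it references (the `register` callback and the example names are hypothetical placeholders for whatever actually talks to the registry):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def upload_order(deps):
    """Order type names so every referenced type comes before its users.

    `deps` maps a full type name to the set of full names it references.
    Raises graphlib.CycleError if the types reference each other in a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


def publish_all(deps, register):
    """Call `register(name, references)` once per type, dependencies first."""
    for name in upload_order(deps):
        register(name, sorted(deps.get(name, ())))


# Hypothetical dependency map for three message types.
deps = {
    "com.example.Invoice": {"com.example.Order"},
    "com.example.Order": {"com.example.Address"},
    "com.example.Address": set(),
}

calls = []
publish_all(deps, lambda name, refs: calls.append((name, refs)))
```

In a build script, `register` would be replaced by the real upload call; the sketch only records the calls so the ordering can be checked.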
The problem is I can't just let arbitrary Java code run during the build stage (or at least I don't want to). Ideally the registry should allow importing a protocol and then properly store all those definitions and references.
What can I do here other than grabbing the code from the Kafka serializer? Maybe there is an API I do not know about?