case: deploying models for inference #2431
So is the main difference with #862 in how the model is built and in how to set up a real-time prod env?
Can the regular dvc.api suffice for now?
OK, I see that paragraph has the DVC-related questions. But I'm still not sure whether they only apply to the real-time context.
BTW maybe first we should have a model registry use case or update the data registry with a model-specific section? Or is it basically the same thing (we can just mention "and models" more in https://dvc.org/doc/use-cases/data-registries)
First, I should have mentioned that this issue is premature, so it's more of a placeholder for me to put down some thoughts until there is an established solution.
This issue is not really about how the model is built, which is what makes it different from #862. In #862, the dvc pipeline needs to be run on a regular schedule, most likely on some remote infrastructure (like a kubernetes cluster or other servers/cloud resources). Using the product recommendation example, the set of products available might change over time, so it's important to retrain the model regularly. In contrast, this issue covers how to take the trained model and use it to make recommendations when people visit the site.
Yes, although we may need to capture metadata about the models.
True, the real-time context does not necessarily need to be separate. The difference to me between batch and real-time inference is that the format of the input data is entirely different, but that's separate from model deployment.
To me, this issue is about how/why to use a model registry.
@jorgeorpinel You can even close this issue if you prefer, and we can reopen when it's more realistic to tackle it. Again, just wanted to document some thoughts.
@dberenbaum wrapping my head around the idea that we might argue in the docs that people don't need separate model registries (vs. or along with writing a "model registry" use case similar to the current data registry). Love the idea! It's a powerful explanation of the DVC benefits in the "model registry" or CD scenarios.
👍
No need, just trying to understand next steps for planning. This actually has more details than #862 which is nice. Probably we can put the questions of both together and from the answers several use case ideas will emerge. Then we pick one to tackle first.
Sounds like a great first use case related to DVC in production. Except perhaps for #2404 (covering CML directly)
From #2490 (comment):
@dberenbaum do you want to elaborate on this idea or close in favor of #2490? Or leave it around for later. Up to you, thanks.
I took "real time" out of the title to make it more general. I don't think it should be prioritized now in #2490. We can leave it open and come back to it later.
Closing in favor of #2544. |
The scenario is that I train models using DVC pipelines, and now I want to use my trained model to make inferences on new data in near realtime as data streams through my production software application (unlike the scenarios described in #862, where a daily batch of data is input to the model). For example, if I train a model to recommend products on my site based in part on the last products the user clicked, model inference needs to happen fast using recent data that cannot be pre-computed.
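A minimal sketch of the consumption side, assuming the regular dvc.api is enough for now: the serving app resolves the trained artifact by repo, path, and Git revision. The repo URL, file path, and tag are placeholders, not names from this thread.

```python
def model_url(rev):
    """Resolve the storage URL of a DVC-tracked model at a given Git rev.

    dvc.api.get_url() maps (path, repo, rev) to the artifact's location on
    the DVC remote, so the serving app can fetch it without cloning the
    development repo. All names below are hypothetical.
    """
    import dvc.api  # deferred so the sketch loads even where DVC isn't installed

    return dvc.api.get_url(
        "models/recommender.pkl",                     # assumed artifact path
        repo="https://github.com/example/model-dev",  # assumed dev repo
        rev=rev,                                      # Git tag as model version
    )
```

A Git tag then doubles as the model version identifier, which keeps versioning inside the development repo.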
The solution probably looks something like:
How should versions be managed? Having the model registry in between the development pipeline and the production app ensures that development can continue without breaking production, but it also introduces friction. The model developer needs to push model versions to a registry that is separate from their development repo and loses all of the info about how the model was developed. The production app has to rely on the model developer to specify which model version to use. When should validation or testing of the model happen?
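To make those version-management questions concrete, here is a toy in-process sketch of what a registry lookup might do; the record fields (version, Git rev, stage) are hypothetical, not an existing DVC feature.

```python
# Toy model registry: each model name maps to an ordered list of records,
# so the production app resolves a stage ("production", "staging") instead
# of hard-coding a version. All entries are made up for illustration.
registry = {
    "recommender": [
        {"version": "v1.0.0", "rev": "a1b2c3d", "stage": "production"},
        {"version": "v1.1.0", "rev": "d4e5f6a", "stage": "staging"},
    ]
}

def resolve(name, stage="production"):
    """Return the newest record for `name` promoted to `stage`, or None."""
    records = [r for r in registry.get(name, []) if r["stage"] == stage]
    return records[-1] if records else None

print(resolve("recommender")["version"])  # -> v1.0.0
```

Promotion (flipping a record's stage after validation) is one natural place where testing of the model could hook in.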
Also, can DVC help with the data/feature engineering side of this scenario? Some data transformations may have to be computed in real time, while others may be precomputed. The model development code is unlikely to be reusable in a real-time production app, since training relies on large batches of historical data rather than streams of real-time data. This issue seems more suited to a feature store than DVC, but is there any integration or suggested pattern that could help?
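One way to picture the batch/realtime split: expensive aggregates come from a batch job (possibly a DVC pipeline stage), while session-dependent features are computed per request. Everything below is an illustrative assumption, not a DVC or feature-store API.

```python
# Precomputed batch features, e.g. the output of a scheduled pipeline stage.
precomputed = {
    "user_42": {"avg_price_clicked": 19.99},
}

def build_features(user_id, recent_clicks):
    """Merge batch features with features derived from the live click stream."""
    batch = precomputed.get(user_id, {"avg_price_clicked": 0.0})
    realtime = {
        "n_recent_clicks": len(recent_clicks),
        "last_click": recent_clicks[-1] if recent_clicks else None,
    }
    return {**batch, **realtime}

features = build_features("user_42", ["sku_1", "sku_9"])
print(features["last_click"])  # -> sku_9
```

Only the `precomputed` side is a natural fit for DVC tracking; the per-request side belongs to the serving app.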