
case: deploying models for inference #2431

Closed
dberenbaum opened this issue Apr 29, 2021 · 9 comments
Labels
A: docs Area: user documentation (gatsby-theme-iterative)

Comments

@dberenbaum
Contributor

dberenbaum commented Apr 29, 2021

The scenario is that I train models using DVC pipelines, and now I want to use my trained model to make inferences on new data in near realtime as data streams through my production software application (unlike the scenarios described in #862, where a daily batch of data is input to the model). For example, if I train a model to recommend products on my site based in part on the last products the user clicked, model inference needs to happen fast using recent data that cannot be pre-computed.

The solution probably looks something like:

  • Train model using dvc pipeline.
  • Push to model registry.
  • Streaming app uses a DVC API to get specified model version.

How should versions be managed? Having the model registry in between the development pipeline and the production app ensures that development can continue without breaking production, but it also introduces friction. The model developer needs to push model versions to a registry that is separate from their development repo and loses all of the info about how the model was developed. The production app has to rely on the model developer to specify which model version to use. When should validation or testing of the model happen?

Also, can dvc help with the data/feature engineering side of this scenario? Some data transformations may have to be computed in realtime, while others may be precomputed. The model development code is unlikely to be reusable in a realtime production app, since training relies on large batches of historical data rather than streams of realtime data. This issue seems more suited to a feature store than dvc, but is there any integration or suggested pattern that could help?

@jorgeorpinel changed the title from "Use case: deploying models for realtime inference" to "case: deploying models for real time inference" Apr 29, 2021
@jorgeorpinel added the A: docs Area: user documentation (gatsby-theme-iterative) label Apr 29, 2021
@jorgeorpinel
Contributor

jorgeorpinel commented Apr 29, 2021

use my trained model to make inferences on new data in near realtime

So is the main difference with #862 in how the model is built and how to set up a real-time prod env?
Wondering if it can be a chapter/version of #862 instead.

Streaming app uses a DVC API to get specified model version

Can the regular dvc.api suffice for now?

@jorgeorpinel
Contributor

jorgeorpinel commented Apr 29, 2021

How should versions be managed?

OK, I see that paragraph has the DVC-related questions. But I'm still not sure whether they only apply to the real-time context.

push model versions to a registry

BTW, maybe first we should have a model registry use case, or update the data registry use case with a model-specific section? Or is it basically the same thing (we could just mention "and models" more in https://dvc.org/doc/use-cases/data-registries)?

@dberenbaum changed the title from "case: deploying models for real time inference" to "Use case: deploying models for real time inference" Apr 29, 2021
@dberenbaum
Contributor Author

dberenbaum commented Apr 29, 2021

First, I should have mentioned that this issue is premature, so it's more of a placeholder for me to put down some thoughts until there is an established solution.

So is the main difference with #862 in how the model is built and how to set up a real-time prod env? Wondering if it can be a chapter/version of #862 instead.

This issue is not really about how the model is built, which is what makes it different from #862. In #862, the dvc pipeline needs to be run on a regular schedule, most likely on some remote infrastructure (like a kubernetes cluster or other servers/cloud resources). Using the product recommendation example, the set of products available might change over time, so it's important to retrain the model regularly. In contrast, this issue covers how to take the trained model and use it to make recommendations when people visit the site.

Can the regular dvc.api suffice for now?

Yes, although we may need to capture metadata about the models.
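One lightweight way that metadata could be captured today (a sketch; the field names below are illustrative, not a DVC convention) is the free-form `meta` key that DVC preserves in `.dvc` files and `dvc.yaml` but otherwise ignores:

```yaml
# models/recommender.pkl.dvc (hypothetical)
outs:
  - path: recommender.pkl
meta:
  framework: lightgbm
  training_data: clickstream-2021-04
  owner: recsys-team
```

Because `meta` travels with the artifact's Git history, a consumer resolving a model version also gets this provenance for free.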

OK, I see that paragraph has the DVC-related questions. But I'm still not sure whether they only apply to the real-time context.

True, the real-time context does not necessarily need to be separate. To me, the difference between batch and real-time inference is that the format of the input data is entirely different, but that's a data question separate from model deployment.

BTW maybe first we should have a model registry use case or update the data registry with a model-specific section? Or is it basically the same thing (we can just mention "and models" more in https://dvc.org/doc/use-cases/data-registries)

To me, this issue is about how/why to use a model registry.

@dberenbaum
Contributor Author

@jorgeorpinel You can even close this issue if you prefer, and we can reopen when it's more realistic to tackle it. Again, just wanted to document some thoughts.

@shcheklein
Member

@dberenbaum I'm wrapping my mind around the idea that we might argue in the docs that people don't need separate model registries (versus, or along with, writing a "model registry" use case similar to the current data registry one). Love the idea! It's a powerful explanation of the DVC benefits in "model registry" or CD scenarios.

@jorgeorpinel
Contributor

jorgeorpinel commented Apr 30, 2021

this issue covers how to take the trained model and use it to make recommendations

👍

You can even close this issue

No need, just trying to understand next steps for planning. This actually has more details than #862, which is nice. Probably we can put the questions of both together, and from the answers several use case ideas will emerge. Then we pick one to tackle first.

this issue is about how/why to use a model registry

Sounds like a great first use case related to DVC in production. Except perhaps for #2404 (covering CML directly)

@dberenbaum changed the title from "Use case: deploying models for real time inference" to "case: deploying models for real time inference" Apr 30, 2021
@jorgeorpinel
Contributor

jorgeorpinel commented May 19, 2021

From #2490 (comment):

feel too narrow, need to find a better angle

@dberenbaum do you want to elaborate on this idea or close in favor of #2490? Or leave it around for later. Up to you, thanks.

@dberenbaum changed the title from "case: deploying models for real time inference" to "case: deploying models for inference" May 20, 2021
@dberenbaum
Contributor Author

I took "real time" out of the title to make it more general. I don't think it should be prioritized now in #2490. We can leave it open and come back to it later.

@jorgeorpinel
Contributor

Closing in favor of #2544.
