DSPy Only Predict #567
This is fantastic man! Really appreciate you. Looks like you went through quite a concept maze to get here and the result shows it.
Pinging @okhat to explain 'stage' in your TODO, had 1 nit. lmk what you think
# TODO: What purpose does stage play here?
# assert self.stage in x, "The generated (input, output) example was not stored"
@okhat can you comment here?
I decided to sneak in a fleshed-out JSONBackend; will add tests so that Predict works with either backend.
Small bug but overall looks great!
This is done; we have feature parity with DSPy from about a week or two ago, prior to COPRO/MIPRO. We should only be failing on the tests for signature_opt, but that has been replaced by COPRO.
Merged commit 9fa4e12 into stanfordnlp:backend-refactor
This PR is intended to get a "DSPy Only Predict" running, i.e. no `dsp` code running. This is ultimately the point at which the Backend refactor becomes breaking.

TODOs

This includes a few main points:
1. Move over `dsp`'s predict primitives:
   - `dspy.Template` to extract each field regardless of whether all fields were generated successfully.
   - `max_tokens`, `temperature` and recursive workflow `max_retries` are respected and exposed appropriately.
2. Introduce new `DummyLanguageModel` for use with the `BaseLM`.
   - `TemplateBackend` generate calls without recovery.
   - `TemplateBackend` generate calls with recovery.
3. Update `Predict`
   - `dsp.Example` and `dsp.Template` to `dspy.Example` and `dspy.Template`.
   - `Predict` tests to ensure functionality remains consistent.

Questions
1. Do we expect the 'non-Template' backends to use the recursive healing functionality?
   I imagine non-Template backends would not need it. I have currently included it within `BaseBackend`, but it could be pulled out and included within `TemplateBackend` only.
2. Load/Dump State for the current `dspy.Predict` class will be broken with this change.
   Ultimately, starting this integration across the different module classes is going to be breaking. Are there specific concerns regarding this, and how do we want to manage it (migration details, etc.)? Also, is it worth merging the backend implementations earlier than transitioning the Module classes?
Let me know what you think, @CyrusOfEden @okhat.
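
For anyone skimming, here is a rough, self-contained sketch of the behaviour described in TODOs 1 and 2 and in Question 1: extract whichever fields were generated, then optionally re-prompt for the missing ones up to `max_retries` times. It is not the code in this PR. `extract_fields`, `generate`, the `Field: value` completion layout, and this particular `DummyLanguageModel` (a callable that replays canned responses) are all assumptions made for illustration only.

```python
import re
from typing import Dict, List


class DummyLanguageModel:
    """Hypothetical test double: replays canned completions in order."""

    def __init__(self, responses: List[str]):
        self.responses = list(responses)
        self.prompts: List[str] = []  # every prompt received, for assertions

    def __call__(self, prompt: str, **kwargs) -> str:
        self.prompts.append(prompt)
        # Repeat the last canned response once the list is exhausted.
        index = min(len(self.prompts) - 1, len(self.responses) - 1)
        return self.responses[index]


def extract_fields(text: str, fields: List[str]) -> Dict[str, str]:
    """Return every field found in `text`; missing fields are simply absent."""
    names = "|".join(re.escape(f) for f in fields)
    # A field runs from "Name:" at a line start to the next field label or the end.
    pattern = re.compile(
        rf"^({names}):[ \t]*(.*?)(?=^(?:{names}):|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    return {m.group(1): m.group(2).strip() for m in pattern.finditer(text)}


def generate(lm, prompt: str, fields: List[str], max_retries: int = 0,
             **lm_kwargs) -> Dict[str, str]:
    """Call `lm`, then re-prompt for missing fields up to `max_retries` times.

    `lm_kwargs` (e.g. max_tokens, temperature) are passed through unchanged.
    """
    found = extract_fields(lm(prompt, **lm_kwargs), fields)
    for _ in range(max_retries):
        missing = [f for f in fields if f not in found]
        if not missing:
            break
        # Resume from what was recovered so far, cueing the first missing field.
        partial = "".join(f"{f}: {found[f]}\n" for f in fields if f in found)
        text = lm(f"{prompt}\n{partial}{missing[0]}:", **lm_kwargs)
        # The continuation starts mid-field, so restore its label before parsing.
        for name, value in extract_fields(f"{missing[0]}: {text}", fields).items():
            found.setdefault(name, value)
    return found


if __name__ == "__main__":
    fields = ["Reasoning", "Answer"]
    canned = ["Reasoning: add the numbers", "4"]

    # Without recovery: only the field that was generated comes back.
    print(generate(DummyLanguageModel(canned), "Question: 2+2?", fields))

    # With recovery: one retry asks the model to continue from "Answer:".
    print(generate(DummyLanguageModel(canned), "Question: 2+2?", fields,
                   max_retries=1, temperature=0.0))
```

In this sketch the retry loop lives in a free function, which mirrors the open question above: the same loop could sit in `BaseBackend` or only in `TemplateBackend` without the dummy model needing to change.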