chore(weave): Implement enhanced feedback structure and mvp filter/query layer #2865
Conversation
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=b418e59d89734fd30a1cedeb8e63879b483f1b03
assert feedback["payload"]["name"] == "score"
assert feedback["payload"]["op_ref"] == get_ref(score).uri()
assert feedback["payload"]["results"] == True
assert feedback["feedback_type"] == "wandb.runnable.score"
This is ok to change as the UI/query layer does not consume it yet.
@@ -39,9 +38,8 @@ def my_score(input_x: int, model_output: int) -> int:

assert len(calls) == 2
feedback = calls[0].summary["weave"]["feedback"][0]
-assert feedback["feedback_type"] == SCORE_TYPE_NAME
+assert feedback["feedback_type"] == "wandb.runnable.my_score"
again, safe to change now that we have a good format
# We're using "beta.1" to indicate that this is a pre-release version.
from typing import TypedDict

SCORE_TYPE_NAME = "wandb.score.beta.1"
we learned from this - no longer needed
This new feedback query is going to be spicy in big projects, but looks good. The calls query builder is also feeling... clunky. Generally this makes sense; I wonder how much of the implementation we can abstract away from the user when adding, while still creating an intuitive way for them to get the data out. It's possible that we might want some way of auto-constructing queries client-side. I'm imagining users not finding the following easy to use:
"$getField": "feedback.[wandb.runnable.my_scorer].payload.output.match"
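For context, here is a hedged sketch of how that selector might sit inside a full query expression. Only `$getField` and the selector string appear in this PR; the surrounding `$expr`/`$gt`/`$literal` operators and the `0.5` threshold are assumptions based on the Mongo-style query shape used here, not confirmed API:

```python
# Hypothetical: filter calls where a scorer's feedback output exceeds a
# threshold. The selector string is from the PR; the operator nesting
# ($expr, $gt, $literal) is an illustrative assumption.
query = {
    "$expr": {
        "$gt": [
            # Selects the scorer's payload field; brackets let the
            # feedback_type contain dots without breaking path parsing.
            {"$getField": "feedback.[wandb.runnable.my_scorer].payload.output.match"},
            {"$literal": 0.5},
        ]
    }
}
```

It is exactly this kind of hand-built nesting that the comment above suggests hiding behind a client-side query constructor.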
)
feedback_join_sql = f"""
    LEFT JOIN feedback
    ON (feedback.weave_ref = concat('weave-trace-internal:///', {_param_slot(project_param, 'String')}, '/call/', calls_merged.id))
Any reason to do this concat in the query vs outside and pass it in?
I don't think so, but I'm not sure. We have to do a concat either way since the last part is dynamic.
@@ -686,6 +686,18 @@ class FeedbackCreateReq(BaseModel):
        }
    ]
)
annotation_ref: Optional[str] = Field(
It would be nice if we could type this to a kind of ref, like objectRef, with a pydantic validator, and then check its construction in the client.
Agreed
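A minimal sketch of what that suggestion could look like, assuming a Pydantic v2 `field_validator` and assuming refs start with a `weave:///` scheme (both the model shape and the prefix check are illustrative, not what the PR ships):

```python
from typing import Optional
from pydantic import BaseModel, Field, field_validator


class FeedbackCreateReq(BaseModel):
    # Hypothetical: annotation_ref stays a string on the wire, but a
    # validator rejects values that are not object refs. A dedicated
    # ObjectRef type with construction checks in the client is the
    # fuller version of this idea.
    annotation_ref: Optional[str] = Field(default=None)

    @field_validator("annotation_ref")
    @classmethod
    def _check_annotation_ref(cls, v: Optional[str]) -> Optional[str]:
        # Assumed ref format; the real scheme/shape may differ.
        if v is not None and not v.startswith("weave:///"):
            raise ValueError("annotation_ref must be a weave object ref")
        return v
```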
This PR lays the groundwork for the next leg of feedback types in our system. Specifically, we have two "classes" of feedback: `runnables` and `annotations`. "Runnables" are feedbacks that are generated by running a program (think: Op, Configured Action, Scorer), while "Annotations" are feedbacks created by humans with specific types (aka human in the loop, aka custom columns, etc...). There were three problems to solve with this emerging data model.

After much iteration and discussion, the solution that seemed most suitable is as follows:

- `feedback_type` now has 2 special prefixes: `wandb.runnable` and `wandb.annotation`, where the total type should be `wandb.runnable.RUNNABLE_NAME` or `wandb.annotation.ANNOTATION_NAME`. Here, `RUNNABLE_NAME` or `ANNOTATION_NAME` are the `name` (aka `object_id`) components of the backing Object or Op. This is the most common group key and is indexed in ClickHouse already.
- `annotation_ref`: The ref pointing to the annotation definition for this feedback.
- `runnable_ref`: The ref pointing to the runnable definition for this feedback.
- `call_ref`: The ref pointing to the resulting call associated with generating this feedback.
- `trigger_ref`: The ref pointing to the trigger definition which resulted in this feedback.
- Query selector: `feedback.[feedback_type].payload.json.selector`. This allows us to specify the feedback type (while supporting dots) and match our other field access patterns.

With all of this together, we can have code like:

... then query ...

This can easily be extended to support different aggregation logic and specific version selectors.
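The prefix scheme described above can be sketched as a small parser. This helper is illustrative only (the function name and return shape are not part of the PR); it just encodes the stated rule that the last component after `wandb.runnable.` or `wandb.annotation.` is the `name`/`object_id` of the backing Object or Op:

```python
def parse_feedback_type(feedback_type: str):
    """Split a special wandb feedback_type into (class, object name).

    Returns None for feedback types outside the two special prefixes.
    Hypothetical helper, not part of the PR's API.
    """
    for prefix in ("wandb.runnable.", "wandb.annotation."):
        if feedback_type.startswith(prefix):
            kind = prefix.rstrip(".").split(".")[-1]  # "runnable" or "annotation"
            name = feedback_type[len(prefix):]        # the object_id component
            return kind, name
    return None  # plain/custom feedback type, no special handling
```

Because the name component is the group key, this same split is what makes the `feedback.[feedback_type].payload...` selector unambiguous even though the type itself contains dots.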