Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(weave): Refactor and rename code to more appropriately handle builtin_object_class not base_object_class #3229

Merged
merged 18 commits into from
Dec 17, 2024

Conversation

tssweeney
Copy link
Collaborator

@tssweeney tssweeney commented Dec 13, 2024

Note: should split and deploy backend before landing front end (python extraction: #3248)

This is just a refactor, but touches many files. Specifically, in this PR: #2826 I introduced a mechanism for creating weave objects via the API for known classes. This system introduced an API param: set_base_object_class: Optional[str] = None. Notice the base word here.

In nearly all the files, I used the term base for example base_object_class, baseObjectClass, BASE_OBJECT_REGISTRY, etc... This is used by Leaderboards and Annotation spec definitions.

However, as I am beginning to rely on this more for Scorers and Models, I realized that this was not correct. Let me take a minute to define these terms:

  • base_object_class: this is the name of the first subclass of any subclass of a weave Object class.
  • object_class: the string of the actual class.

Let's consider the following example:

class Scorer(weave.Object):
   ...

class LLMScorer(Scorer):
   ...

class HalucinationLLMScorer(LLMScorer):
   ...

scorer = HalucinationLLMScorer()

In this case, the class hierarchy is: HalucinationLLMScorer -> LLMScorer -> Scorer -> Object -> BaseModel

So, in our terms:

  • base_object_class = "Scorer"
  • object_class = "HalucinationLLMScorer"

Now, base_object_class is treated specially and extracted to be stored in the DB and has first-class filter support. This allows us to find all Scorers, Models, or other objects, even if they are sub classes of those "base" classes.

Furthermore, in the val payload of the object, we have special keys:

  • _type = "HalucinationLLMScorer"
  • _class_Name = "HalucinationLLMScorer"
  • _bases = ["LLMScorer", "Scorer", "Object", "BaseModel"]

Yes, _type and _class_name are the same.


So, why is this all important? Well, names are important! And the use of base here was not correct b/c it was intended to target the final object_name of some builtin class. Therefore, this PR effectively renames all cases of base to builtin where appropriate such that anytime base appears, it really means the base_object_class and is no longer ambiguous.

@circle-job-mirror
Copy link

circle-job-mirror bot commented Dec 13, 2024

@tssweeney tssweeney changed the title chore(weave): Refactor and rename code to more appropriately handle object_class not base_object_class chore(weave): Refactor and rename code to more appropriately handle builtin_object_class not base_object_class Dec 13, 2024
@@ -0,0 +1,151 @@
from typing import Any, Optional, TypedDict
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an edited version of the now deleted weave/trace_server/base_object_class_util.py

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly just comments and name changes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a rename of another file

@tssweeney tssweeney marked this pull request as ready for review December 14, 2024 00:06
@tssweeney tssweeney requested review from a team as code owners December 14, 2024 00:06
> = TraceObjSchema<BaseObjectClassType<C>, C>;

export const useBaseObjectInstances = <C extends BaseObjectClassRegistryKeys>(
// Notice: this is still `base` object class, not `builtin` object class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future readers might prefer a descriptive comment rather than this notice which feels more relevant for reviewers (which I appreciate)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 200 to 203
def model_post_init(self, __context: Any) -> None:
# If set_base_object_class is provided, use it to set builtin_object_class for backwards compatibility
if self.set_base_object_class is not None and self.builtin_object_class is None:
self.builtin_object_class = self.set_base_object_class
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat

Copy link
Member

@gtarpenning gtarpenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want to have the docs team take a peek at the docs changes, lgtm

@tssweeney tssweeney enabled auto-merge (squash) December 17, 2024 04:06
@tssweeney tssweeney disabled auto-merge December 17, 2024 17:49
@tssweeney tssweeney merged commit efc4068 into master Dec 17, 2024
120 of 121 checks passed
@tssweeney tssweeney deleted the tim/fix_base_object_class_notation branch December 17, 2024 19:02
@github-actions github-actions bot locked and limited conversation to collaborators Dec 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants