chore(weave): (python-only) Refactor and rename code to more appropriately handle builtin_object_class not base_object_class #3248

Merged
merged 16 commits into from
Dec 16, 2024
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# BaseObjectClasses
# BuiltinObjectClasses

## Refresher on Objects and object storage

@@ -79,11 +79,11 @@ While many Weave Objects are free-form and user-defined, there is often a need f

Here's how to define and use a validated base object:

1. **Define your schema** (in `weave/trace_server/interface/base_object_classes/your_schema.py`):
1. **Define your schema** (in `weave/trace_server/interface/builtin_object_classes/your_schema.py`):

```python
from pydantic import BaseModel
from weave.trace_server.interface.base_object_classes import base_object_def
from weave.trace_server.interface.builtin_object_classes import base_object_def

class NestedConfig(BaseModel):
setting_a: int
@@ -116,7 +116,7 @@ curl -X POST 'https://trace.wandb.ai/obj/create' \
"project_id": "user/project",
"object_id": "my_config",
"val": {...},
"set_base_object_class": "MyConfig"
"object_class": "MyConfig"
}
}'

@@ -154,38 +154,38 @@ Run `make synchronize-base-object-schemas` to ensure the frontend TypeScript typ

### Architecture Flow

1. Define your schema in a Python file in the `weave/trace_server/interface/base_object_classes/` directory. See `weave/trace_server/interface/base_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/base_object_classes/base_object_registry.py` by calling `register_base_object`.
1. Define your schema in a Python file in the `weave/trace_server/interface/builtin_object_classes/` directory. See `weave/trace_server/interface/builtin_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/builtin_object_classes/builtin_object_registry.py` by calling `register_base_object`.
3. Run `make synchronize-base-object-schemas` to generate the frontend types.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/base_object_classes/generated/generated_base_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts`.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/builtin_object_classes/generated/generated_builtin_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts`.
4. Now, each use case uses different parts:
1. `Python Writing`. Users can directly import these classes and use them as normal Pydantic models, which get published with `weave.publish`. The Python client correctly builds the requisite payload.
2. `Python Reading`. Users can `weave.ref().get()` and the weave python SDK will return the instance with the correct type. Note: we do some special handling such that the returned object is not a WeaveObject, but literally the exact pydantic class.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish base objects by setting the `set_base_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish builtin objects (set of weave.Objects provided by Weave) by setting the `builtin_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
4. `HTTP Reading`. When querying for objects, the server will return the object with the correct type if the `base_object_class` metadata field is set.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
* Note: it is critical that all techniques produce the same digest for the same data - which is tested in the tests. This way versions are not thrashed by different clients/users.
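
The registration, validation, and schema-generation steps above can be sketched in plain Pydantic. This is a minimal illustration under stated assumptions, not Weave's implementation: `EXAMPLE_REGISTRY`, `register_example_object`, `validate_payload`, and `generate_composite_schema` are hypothetical names, and the `name` and `nested` fields on `MyConfig` are invented for the example. Only the overall shape (a name-to-class registry, validation by class name, and a composite model built with `create_model`) mirrors what this document describes.

```python
from typing import Any

from pydantic import BaseModel, ValidationError, create_model

# Hypothetical stand-in for the builtin object registry (class name -> class).
EXAMPLE_REGISTRY: dict[str, type[BaseModel]] = {}


def register_example_object(cls: type[BaseModel]) -> type[BaseModel]:
    """Register a Pydantic class under its class name (illustrative helper)."""
    EXAMPLE_REGISTRY[cls.__name__] = cls
    return cls


class NestedConfig(BaseModel):
    setting_a: int


class MyConfig(BaseModel):
    # Field names are assumptions made for this sketch.
    name: str
    nested: NestedConfig


register_example_object(MyConfig)


def validate_payload(class_name: str, val: dict[str, Any]) -> BaseModel:
    """Look up a registered class by name and validate a raw payload against it."""
    cls = EXAMPLE_REGISTRY.get(class_name)
    if cls is None:
        raise ValueError(f"Unknown object class: {class_name}")
    return cls.model_validate(val)


def generate_composite_schema() -> dict[str, Any]:
    """Build one parent model with every registered class as a required field,
    then emit a single JSON schema covering all of them."""
    composite = create_model(
        "CompositeExample",
        **{name: (cls, ...) for name, cls in EXAMPLE_REGISTRY.items()},
    )
    return composite.model_json_schema()


if __name__ == "__main__":
    obj = validate_payload("MyConfig", {"name": "demo", "nested": {"setting_a": 1}})
    print(obj)
    try:
        validate_payload("MyConfig", {"name": "demo", "nested": {"setting_a": "oops"}})
    except ValidationError as exc:
        print(f"Rejected invalid payload with {exc.error_count()} error(s)")
    print(sorted(generate_composite_schema()["properties"]))
```

Keying the registry by class name is what lets an HTTP client name the class as a plain string in the request body, while the server still validates against the full Pydantic schema.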

```mermaid
graph TD
subgraph Schema Definition
F["weave/trace_server/interface/<br>base_object_classes/your_schema.py"] --> |defines| P[Pydantic BaseObject]
P --> |register_base_object| R["base_object_registry.py"]
P --> |register_base_object| R["builtin_object_registry.py"]
end
subgraph Schema Generation
M["make synchronize-base-object-schemas"] --> G["make generate_base_object_schemas"]
G --> |runs| S["weave/scripts/<br>generate_base_object_schemas.py"]
R --> |import registered classes| S
S --> |generates| J["generated_base_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBaseObjectClasses.zod.ts"]
S --> |generates| J["generated_builtin_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBuiltinObjectClasses.zod.ts"]
J --> Z
end
subgraph "Trace Server"
subgraph "HTTP API"
R --> |validates using| HW["POST obj/create<br>set_base_object_class"]
R --> |validates using| HW["POST obj/create<br>object_class"]
HW --> DB[(Weave Object Store)]
HR["POST objs/query<br>base_object_classes"] --> |Filters base_object_class| DB
end
@@ -203,7 +203,7 @@ graph TD
Z --> |import| UBI["useBaseObjectInstances"]
Z --> |import| UCI["useCreateBaseObjectInstance"]
UBI --> |Filters base_object_class| HR
UCI --> |set_base_object_class| HW
UCI --> |object_class| HW
UI[React UI] --> UBI
UI --> UCI
end
@@ -15,7 +15,7 @@
)
from tests.trace.util import client_is_sqlite
from weave.trace.weave_client import WeaveClient
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionSpec,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion tests/trace/test_actions_lifecycle.py
@@ -3,7 +3,7 @@
import weave
from tests.trace.util import client_is_sqlite
from weave.trace.weave_client import WeaveClient
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionSpec,
)
from weave.trace_server.trace_server_interface import (
16 changes: 8 additions & 8 deletions tests/trace/test_base_object_classes.py
@@ -22,7 +22,7 @@
from weave.trace.refs import ObjectRef
from weave.trace.weave_client import WeaveClient
from weave.trace_server import trace_server_interface as tsi
from weave.trace_server.interface.base_object_classes.test_only_example import (
from weave.trace_server.interface.builtin_object_classes.test_only_example import (
TestOnlyNestedBaseModel,
)

@@ -139,7 +139,7 @@ def test_interface_creation(client):
"project_id": client._project_id(),
"object_id": nested_obj_id,
"val": nested_obj.model_dump(),
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -164,7 +164,7 @@ def test_interface_creation(client):
"project_id": client._project_id(),
"object_id": top_level_obj_id,
"val": top_obj.model_dump(),
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
@@ -271,7 +271,7 @@ def test_digest_equality(client):
"project_id": client._project_id(),
"object_id": nested_obj_id,
"val": nested_obj.model_dump(),
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -300,7 +300,7 @@ def test_digest_equality(client):
"project_id": client._project_id(),
"object_id": top_level_obj_id,
"val": top_obj.model_dump(),
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
@@ -322,7 +322,7 @@ def test_schema_validation(client):
"object_id": "nested_obj",
# Incorrect schema, should raise!
"val": {"a": 2},
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -340,7 +340,7 @@ def test_schema_validation(client):
"_class_name": "TestOnlyNestedBaseObject",
"_bases": ["BaseObject", "BaseModel"],
},
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -359,7 +359,7 @@ def test_schema_validation(client):
"_class_name": "TestOnlyNestedBaseObject",
"_bases": ["BaseObject", "BaseModel"],
},
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
2 changes: 1 addition & 1 deletion weave/flow/annotation_spec.py
@@ -1,4 +1,4 @@
from weave.trace_server.interface.base_object_classes import annotation_spec
from weave.trace_server.interface.builtin_object_classes import annotation_spec

# Re-export:
AnnotationSpec = annotation_spec.AnnotationSpec
2 changes: 1 addition & 1 deletion weave/flow/leaderboard.py
@@ -4,7 +4,7 @@

from weave.trace.refs import OpRef
from weave.trace.weave_client import WeaveClient, get_ref
from weave.trace_server.interface.base_object_classes import leaderboard
from weave.trace_server.interface.builtin_object_classes import leaderboard
from weave.trace_server.trace_server_interface import CallsFilter


16 changes: 8 additions & 8 deletions weave/scripts/generate_base_object_schemas.py
@@ -3,30 +3,30 @@

from pydantic import create_model

from weave.trace_server.interface.base_object_classes.base_object_registry import (
BASE_OBJECT_REGISTRY,
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import (
BUILTIN_OBJECT_REGISTRY,
)

OUTPUT_DIR = (
Path(__file__).parent.parent
/ "trace_server"
/ "interface"
/ "base_object_classes"
/ "builtin_object_classes"
/ "generated"
)
OUTPUT_PATH = OUTPUT_DIR / "generated_base_object_class_schemas.json"
OUTPUT_PATH = OUTPUT_DIR / "generated_builtin_object_class_schemas.json"


def generate_schemas() -> None:
"""
Generate JSON schemas for all registered base objects in BASE_OBJECT_REGISTRY.
Generate JSON schemas for all registered base objects in BUILTIN_OBJECT_REGISTRY.
Creates a top-level schema that includes all registered objects and writes it
to 'generated_base_object_class_schemas.json'.
to 'generated_builtin_object_class_schemas.json'.
"""
# Dynamically create a parent model with all registered objects as properties
CompositeModel = create_model(
"CompositeBaseObject",
**{name: (cls, ...) for name, cls in BASE_OBJECT_REGISTRY.items()},
**{name: (cls, ...) for name, cls in BUILTIN_OBJECT_REGISTRY.items()},
)

# Generate the schema using the composite model
@@ -39,7 +39,7 @@ def generate_schemas() -> None:
with OUTPUT_PATH.open("w") as f:
json.dump(top_level_schema, f, indent=2)

print(f"Generated schema for {len(BASE_OBJECT_REGISTRY)} objects")
print(f"Generated schema for {len(BUILTIN_OBJECT_REGISTRY)} objects")
print(f"Wrote schema to {OUTPUT_PATH.absolute()}")


2 changes: 1 addition & 1 deletion weave/trace/api.py
@@ -26,7 +26,7 @@
should_disable_weave,
)
from weave.trace.table import Table
from weave.trace_server.interface.base_object_classes import leaderboard
from weave.trace_server.interface.builtin_object_classes import leaderboard


def init(
2 changes: 1 addition & 1 deletion weave/trace/base_objects.py
@@ -1 +1 @@
from weave.trace_server.interface.base_object_classes.base_object_registry import *
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import *
8 changes: 4 additions & 4 deletions weave/trace/serialize.py
@@ -9,8 +9,8 @@
from weave.trace.object_record import ObjectRecord
from weave.trace.refs import ObjectRef, TableRef, parse_uri
from weave.trace.sanitize import REDACT_KEYS, REDACTED_VALUE
from weave.trace_server.interface.base_object_classes.base_object_registry import (
BASE_OBJECT_REGISTRY,
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import (
BUILTIN_OBJECT_REGISTRY,
)
from weave.trace_server.trace_server_interface import (
FileContentReadReq,
@@ -262,9 +262,9 @@ def from_json(obj: Any, project_id: str, server: TraceServerInterface) -> Any:
elif (
isinstance(val_type, str)
and obj.get("_class_name") == val_type
and (baseObject := BASE_OBJECT_REGISTRY.get(val_type))
and (builtin_object_class := BUILTIN_OBJECT_REGISTRY.get(val_type))
):
return baseObject.model_validate(obj)
return builtin_object_class.model_validate(obj)
else:
return ObjectRecord(
{k: from_json(v, project_id, server) for k, v in obj.items()}
@@ -1,7 +1,7 @@
import json
from typing import Any

from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ContainsWordsActionConfig,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion weave/trace_server/actions_worker/actions/llm_judge.py
@@ -1,7 +1,7 @@
import json
from typing import Any

from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
LlmJudgeActionConfig,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion weave/trace_server/actions_worker/dispatcher.py
@@ -6,7 +6,7 @@
do_contains_words_action,
)
from weave.trace_server.actions_worker.actions.llm_judge import do_llm_judge_action
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionConfigType,
ActionSpec,
ContainsWordsActionConfig,