chore(weave): (python-only) Refactor and rename code to more appropriately handle builtin_object_class not base_object_class (#3248)

* initial changes

* change to BUILTIN_OBJECT_REGISTRY

* change to BUILTIN_OBJECT_REGISTRY

* change to big change to builtin

* wow, this is getting to be a lot

* lint

* lint

* fix generation

* i think complete

* fixed a few errors

* fixed a few names

* little touchups

* init

* init

* small fix
tssweeney authored Dec 16, 2024
1 parent 6d95ebb commit a6886f5
Showing 28 changed files with 661 additions and 214 deletions.
30 changes: 15 additions & 15 deletions dev_docs/BaseObjectClasses.md → dev_docs/BuiltinObjectClasses.md
@@ -1,4 +1,4 @@
# BaseObjectClasses
# BuiltinObjectClasses

## Refresher on Objects and object storage

@@ -79,11 +79,11 @@ While many Weave Objects are free-form and user-defined, there is often a need f

Here's how to define and use a validated base object:

1. **Define your schema** (in `weave/trace_server/interface/base_object_classes/your_schema.py`):
1. **Define your schema** (in `weave/trace_server/interface/builtin_object_classes/your_schema.py`):

```python
from pydantic import BaseModel
from weave.trace_server.interface.base_object_classes import base_object_def
from weave.trace_server.interface.builtin_object_classes import base_object_def

class NestedConfig(BaseModel):
setting_a: int
@@ -116,7 +116,7 @@ curl -X POST 'https://trace.wandb.ai/obj/create' \
"project_id": "user/project",
"object_id": "my_config",
"val": {...},
"set_base_object_class": "MyConfig"
"object_class": "MyConfig"
}
}'

@@ -154,38 +154,38 @@ Run `make synchronize-base-object-schemas` to ensure the frontend TypeScript typ

### Architecture Flow

1. Define your schema in a python file in the `weave/trace_server/interface/base_object_classes/test_only_example.py` directory. See `weave/trace_server/interface/base_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/base_object_classes/base_object_registry.py` by calling `register_base_object`.
1. Define your schema in a python file in the `weave/trace_server/interface/builtin_object_classes/test_only_example.py` directory. See `weave/trace_server/interface/builtin_object_classes/test_only_example.py` as an example.
2. Make sure to register your schemas in `weave/trace_server/interface/builtin_object_classes/builtin_object_registry.py` by calling `register_base_object`.
3. Run `make synchronize-base-object-schemas` to generate the frontend types.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/base_object_classes/generated/generated_base_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts`.
* The first step (`make generate_base_object_schemas`) will run `weave/scripts/generate_base_object_schemas.py` to generate a JSON schema in `weave/trace_server/interface/builtin_object_classes/generated/generated_builtin_object_class_schemas.json`.
* The second step (yarn `generate-schemas`) will read this file and use it to generate the frontend types located in `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts`.
4. Now, each use case uses different parts:
1. `Python Writing`. Users can directly import these classes and use them as normal Pydantic models, which get published with `weave.publish`. The python client correct builds the requisite payload.
2. `Python Reading`. Users can `weave.ref().get()` and the weave python SDK will return the instance with the correct type. Note: we do some special handling such that the returned object is not a WeaveObject, but literally the exact pydantic class.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish base objects by setting the `set_base_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
3. `HTTP Writing`. In cases where the client/user does not want to add the special type information, users can publish builtin objects (set of weave.Objects provided by Weave) by setting the `builtin_object_class` setting on `POST obj/create` to the name of the class. The weave server will validate the object against the schema, update the metadata fields, and store the object.
4. `HTTP Reading`. When querying for objects, the server will return the object with the correct type if the `base_object_class` metadata field is set.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBaseObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
5. `Frontend`. The frontend will read the zod schema from `weave-js/src/components/PagePanelComponents/Home/Browse3/pages/wfReactInterface/generatedBuiltinObjectClasses.zod.ts` and use that to provide compile time type safety when using `useBaseObjectInstances` and runtime type safety when using `useCreateBaseObjectInstance`.
* Note: it is critical that all techniques produce the same digest for the same data - which is tested in the tests. This way versions are not thrashed by different clients/users.
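
The digest requirement in the note above can be illustrated with a minimal sketch. This is not Weave's actual digest algorithm, which this diff does not show; `canonical_digest` and the SHA-256-over-sorted-JSON scheme are assumptions, chosen only to show why every write path has to serialize the same data identically.

```python
import hashlib
import json
from typing import Any


def canonical_digest(val: Any) -> str:
    """Hash a JSON-compatible value so that dict key order does not matter.

    Hypothetical helper: sorted keys and fixed separators give one canonical
    byte string per logical value, so equal data yields an equal digest.
    """
    encoded = json.dumps(val, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# The same logical object, as it might arrive from two different clients.
from_python_client = {"name": "my_config", "nested": {"setting_a": 1}}
from_http_client = {"nested": {"setting_a": 1}, "name": "my_config"}

# A genuinely different object, which should hash differently.
changed = {"name": "my_config", "nested": {"setting_a": 2}}
```

If the two write paths produced different digests for `from_python_client` and `from_http_client`, each client would create a new version of an unchanged object, which is the version thrashing the note warns about.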

```mermaid
graph TD
subgraph Schema Definition
F["weave/trace_server/interface/<br>base_object_classes/your_schema.py"] --> |defines| P[Pydantic BaseObject]
P --> |register_base_object| R["base_object_registry.py"]
P --> |register_base_object| R["builtin_object_registry.py"]
end
subgraph Schema Generation
M["make synchronize-base-object-schemas"] --> G["make generate_base_object_schemas"]
G --> |runs| S["weave/scripts/<br>generate_base_object_schemas.py"]
R --> |import registered classes| S
S --> |generates| J["generated_base_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBaseObjectClasses.zod.ts"]
S --> |generates| J["generated_builtin_object_class_schemas.json"]
M --> |yarn generate-schemas| Z["generatedBuiltinObjectClasses.zod.ts"]
J --> Z
end
subgraph "Trace Server"
subgraph "HTTP API"
R --> |validates using| HW["POST obj/create<br>set_base_object_class"]
R --> |validates using| HW["POST obj/create<br>object_class"]
HW --> DB[(Weave Object Store)]
HR["POST objs/query<br>base_object_classes"] --> |Filters base_object_class| DB
end
@@ -203,7 +203,7 @@ graph TD
Z --> |import| UBI["useBaseObjectInstances"]
Z --> |import| UCI["useCreateBaseObjectInstance"]
UBI --> |Filters base_object_class| HR
UCI --> |set_base_object_class| HW
UCI --> |object_class| HW
UI[React UI] --> UBI
UI --> UCI
end
@@ -15,7 +15,7 @@
)
from tests.trace.util import client_is_sqlite
from weave.trace.weave_client import WeaveClient
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionSpec,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion tests/trace/test_actions_lifecycle.py
@@ -3,7 +3,7 @@
import weave
from tests.trace.util import client_is_sqlite
from weave.trace.weave_client import WeaveClient
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionSpec,
)
from weave.trace_server.trace_server_interface import (
16 changes: 8 additions & 8 deletions tests/trace/test_base_object_classes.py
@@ -22,7 +22,7 @@
from weave.trace.refs import ObjectRef
from weave.trace.weave_client import WeaveClient
from weave.trace_server import trace_server_interface as tsi
from weave.trace_server.interface.base_object_classes.test_only_example import (
from weave.trace_server.interface.builtin_object_classes.test_only_example import (
TestOnlyNestedBaseModel,
)

@@ -139,7 +139,7 @@ def test_interface_creation(client):
"project_id": client._project_id(),
"object_id": nested_obj_id,
"val": nested_obj.model_dump(),
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -164,7 +164,7 @@ def test_interface_creation(client):
"project_id": client._project_id(),
"object_id": top_level_obj_id,
"val": top_obj.model_dump(),
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
@@ -271,7 +271,7 @@ def test_digest_equality(client):
"project_id": client._project_id(),
"object_id": nested_obj_id,
"val": nested_obj.model_dump(),
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -300,7 +300,7 @@ def test_digest_equality(client):
"project_id": client._project_id(),
"object_id": top_level_obj_id,
"val": top_obj.model_dump(),
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
@@ -322,7 +322,7 @@ def test_schema_validation(client):
"object_id": "nested_obj",
# Incorrect schema, should raise!
"val": {"a": 2},
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -340,7 +340,7 @@ def test_schema_validation(client):
"_class_name": "TestOnlyNestedBaseObject",
"_bases": ["BaseObject", "BaseModel"],
},
"set_base_object_class": "TestOnlyNestedBaseObject",
"builtin_object_class": "TestOnlyNestedBaseObject",
}
}
)
@@ -359,7 +359,7 @@ def test_schema_validation(client):
"_class_name": "TestOnlyNestedBaseObject",
"_bases": ["BaseObject", "BaseModel"],
},
"set_base_object_class": "TestOnlyExample",
"builtin_object_class": "TestOnlyExample",
}
}
)
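
The request shape exercised by the tests above can be sketched as a small payload builder. This is an illustration under stated assumptions: `build_obj_create_body` is a hypothetical helper, and the outer `"obj"` wrapper key is assumed because the hunks show only the inner fields. The tests in this commit use `builtin_object_class`, while the docs hunk shows `object_class`; the sketch follows the tests.

```python
import json
from typing import Any


def build_obj_create_body(
    project_id: str,
    object_id: str,
    val: dict[str, Any],
    builtin_object_class: str,
) -> dict[str, Any]:
    """Build a body for POST obj/create (hypothetical helper).

    The "obj" wrapper key is an assumption; the inner field names are the
    ones visible in the test hunks of this commit.
    """
    return {
        "obj": {
            "project_id": project_id,
            "object_id": object_id,
            "val": val,
            # Renamed in this commit from "set_base_object_class".
            "builtin_object_class": builtin_object_class,
        }
    }


body = build_obj_create_body(
    project_id="user/project",
    object_id="nested_obj",
    val={"a": 2},
    builtin_object_class="TestOnlyNestedBaseObject",
)
print(json.dumps(body, indent=2))
```

Per `test_schema_validation`, a `val` that does not match the named class's schema (such as `{"a": 2}` here) is expected to be rejected by the server; the sketch only builds the body and sends nothing.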
2 changes: 1 addition & 1 deletion weave/flow/annotation_spec.py
@@ -1,4 +1,4 @@
from weave.trace_server.interface.base_object_classes import annotation_spec
from weave.trace_server.interface.builtin_object_classes import annotation_spec

# Re-export:
AnnotationSpec = annotation_spec.AnnotationSpec
2 changes: 1 addition & 1 deletion weave/flow/leaderboard.py
@@ -4,7 +4,7 @@

from weave.trace.refs import OpRef
from weave.trace.weave_client import WeaveClient, get_ref
from weave.trace_server.interface.base_object_classes import leaderboard
from weave.trace_server.interface.builtin_object_classes import leaderboard
from weave.trace_server.trace_server_interface import CallsFilter


16 changes: 8 additions & 8 deletions weave/scripts/generate_base_object_schemas.py
@@ -3,30 +3,30 @@

from pydantic import create_model

from weave.trace_server.interface.base_object_classes.base_object_registry import (
BASE_OBJECT_REGISTRY,
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import (
BUILTIN_OBJECT_REGISTRY,
)

OUTPUT_DIR = (
Path(__file__).parent.parent
/ "trace_server"
/ "interface"
/ "base_object_classes"
/ "builtin_object_classes"
/ "generated"
)
OUTPUT_PATH = OUTPUT_DIR / "generated_base_object_class_schemas.json"
OUTPUT_PATH = OUTPUT_DIR / "generated_builtin_object_class_schemas.json"


def generate_schemas() -> None:
"""
Generate JSON schemas for all registered base objects in BASE_OBJECT_REGISTRY.
Generate JSON schemas for all registered base objects in BUILTIN_OBJECT_REGISTRY.
Creates a top-level schema that includes all registered objects and writes it
to 'generated_base_object_class_schemas.json'.
to 'generated_builtin_object_class_schemas.json'.
"""
# Dynamically create a parent model with all registered objects as properties
CompositeModel = create_model(
"CompositeBaseObject",
**{name: (cls, ...) for name, cls in BASE_OBJECT_REGISTRY.items()},
**{name: (cls, ...) for name, cls in BUILTIN_OBJECT_REGISTRY.items()},
)

# Generate the schema using the composite model
@@ -39,7 +39,7 @@ def generate_schemas() -> None:
with OUTPUT_PATH.open("w") as f:
json.dump(top_level_schema, f, indent=2)

print(f"Generated schema for {len(BASE_OBJECT_REGISTRY)} objects")
print(f"Generated schema for {len(BUILTIN_OBJECT_REGISTRY)} objects")
print(f"Wrote schema to {OUTPUT_PATH.absolute()}")


2 changes: 1 addition & 1 deletion weave/trace/api.py
@@ -26,7 +26,7 @@
should_disable_weave,
)
from weave.trace.table import Table
from weave.trace_server.interface.base_object_classes import leaderboard
from weave.trace_server.interface.builtin_object_classes import leaderboard


def init(
2 changes: 1 addition & 1 deletion weave/trace/base_objects.py
@@ -1 +1 @@
from weave.trace_server.interface.base_object_classes.base_object_registry import *
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import *
8 changes: 4 additions & 4 deletions weave/trace/serialize.py
@@ -9,8 +9,8 @@
from weave.trace.object_record import ObjectRecord
from weave.trace.refs import ObjectRef, TableRef, parse_uri
from weave.trace.sanitize import REDACT_KEYS, REDACTED_VALUE
from weave.trace_server.interface.base_object_classes.base_object_registry import (
BASE_OBJECT_REGISTRY,
from weave.trace_server.interface.builtin_object_classes.builtin_object_registry import (
BUILTIN_OBJECT_REGISTRY,
)
from weave.trace_server.trace_server_interface import (
FileContentReadReq,
@@ -262,9 +262,9 @@ def from_json(obj: Any, project_id: str, server: TraceServerInterface) -> Any:
elif (
isinstance(val_type, str)
and obj.get("_class_name") == val_type
and (baseObject := BASE_OBJECT_REGISTRY.get(val_type))
and (builtin_object_class := BUILTIN_OBJECT_REGISTRY.get(val_type))
):
return baseObject.model_validate(obj)
return builtin_object_class.model_validate(obj)
else:
return ObjectRecord(
{k: from_json(v, project_id, server) for k, v in obj.items()}
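
The registry lookup in the `from_json` hunk above can be sketched in isolation. This is a simplified stand-in, not the Weave implementation: `ExampleBuiltin`, `EXAMPLE_REGISTRY`, and `decode` are hypothetical names, the `_type` key used to obtain `val_type` is an assumption (the hunk does not show where `val_type` comes from), and a dataclass with a `model_validate` classmethod stands in for a registered Pydantic class.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ExampleBuiltin:
    """Hypothetical stand-in for a class in BUILTIN_OBJECT_REGISTRY."""

    name: str

    @classmethod
    def model_validate(cls, obj: dict[str, Any]) -> "ExampleBuiltin":
        # Mimics the Pydantic entry point used in the hunk; raises KeyError
        # if the required field is missing.
        return cls(name=obj["name"])


# Hypothetical registry mapping class names to classes.
EXAMPLE_REGISTRY: dict[str, type[ExampleBuiltin]] = {"ExampleBuiltin": ExampleBuiltin}


def decode(obj: dict[str, Any]) -> Any:
    """Return a typed instance when the payload names a registered class."""
    val_type = obj.get("_type")  # assumed source of val_type
    if (
        isinstance(val_type, str)
        and obj.get("_class_name") == val_type
        and (builtin_object_class := EXAMPLE_REGISTRY.get(val_type))
    ):
        return builtin_object_class.model_validate(obj)
    # Fallback: the real code builds an ObjectRecord; a plain dict copy is
    # used here to keep the sketch self-contained.
    return dict(obj)
```

The walrus assignment both checks for a registry hit and binds the class in one condition, which is why the rename from `baseObject` to `builtin_object_class` touches only that expression and the `return` beneath it.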
@@ -1,7 +1,7 @@
import json
from typing import Any

from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ContainsWordsActionConfig,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion weave/trace_server/actions_worker/actions/llm_judge.py
@@ -1,7 +1,7 @@
import json
from typing import Any

from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
LlmJudgeActionConfig,
)
from weave.trace_server.trace_server_interface import (
2 changes: 1 addition & 1 deletion weave/trace_server/actions_worker/dispatcher.py
@@ -6,7 +6,7 @@
do_contains_words_action,
)
from weave.trace_server.actions_worker.actions.llm_judge import do_llm_judge_action
from weave.trace_server.interface.base_object_classes.actions import (
from weave.trace_server.interface.builtin_object_classes.actions import (
ActionConfigType,
ActionSpec,
ContainsWordsActionConfig,