-
Notifications
You must be signed in to change notification settings - Fork 332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatic TypeScript definition generation from JSG RTTI #113
Conversation
@mrbbot ... ok, so the reason we had to revert this was twofold:
@mikea also needs to review the code changes here before it can land. The changes here LGTM but Mike is the domain expert so while I might sign off it should wait to land until both of the following happen:
|
11d9b65
to
f83ad19
Compare
This may have been fixed in newer commits, but slight bug in the version you posted; everything URL related is prefixed with URL. So, for instance, the URL type is urlURL. |
de64f5a
to
4cd2b6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through all changes outside of types/. Sending comments so far.
I know it could be a hustle, but maybe we can break this PR into several safer ones?
In particular I'd like to see the rtti change and npm setup as separate PRs, since they have very different risk profiles.
@@ -0,0 +1,183 @@ | |||
import assert from "assert"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kentonv maybe we want this in src/types
? or src/workerd/types
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kentonv .. just pinging again on this.
However, I don't think answering this should block this PR landing because we can always move this if necessary.
# | ||
# workerd uses Node.js scripts for generating TypeScript types. | ||
|
||
http_archive( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will definitely make embedding workerd hard.
Maybe extract this to a separate function setup_js_dependencies
in bzl file so that it can be called by downstream deps if needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, I don't think you can call load()
in functions (bazelbuild/bazel#1550)? So we wouldn't be able to call rules_js_dependencies
, rules_ts_dependencies
, npm_translate_lock
or npm_repositories
in our function? We have the same problem with Rust dependencies atm: the internal repository currently duplicates all the Rust WORKSPACE
rules. 😕
@penalosa we'll fix this when we add overrides that can rename types. 👍 The issue is that the spec-compliant URL implementation is under an additional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TS code is rather straightforward (though a bit dense). I have left some comments.
Having some testing is probably my biggest concern now.
import ts from "typescript"; | ||
import { printNode } from "../print"; | ||
|
||
// Replaces custom Iterator-like interfaces with built-in `Iterator` types: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder: why do we generate something intermediate and patch it later, rather than generate correct property immediately?
I.e. when generating a property, check that its type is our iterator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing it this way allows us to keep the initial generation code simple and separate concerns. As JSG_ITERATOR
generates 2 structures (one for the iterator, one for the next result containing the value type we need), the initial generation functions would need to be aware of the full program as opposed to just the structure they were generating code for, making testing more difficult. This approach also allows us to use TypeScript's transformer/visitation APIs that were designed for tasks like this. I'm not too worried about squeezing performance out of this program either.
|
||
// If `typeNode` has the shape `T | undefined`, returns `T`, otherwise returns | ||
// `undefined`. | ||
export function maybeUnwrapOptional( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to consult generated types to glimpes information from the original structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we used the original structure information, we'd need to separately assert the generated type nodes had the expected optional shape T | undefined
in order to extract T
and satisfy TypeScript. Therefore, it makes sense to do the check like this, which narrows the TypeScript type for us. This helper is also useful in structure.ts
, when generating properties.
Exciting. |
// Input: | ||
// Binary Cap’n Proto file path, defaults to reading from stdin if omitted | ||
export async function main(args?: string[]) { | ||
const { values: options, positionals } = util.parseArgs({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just caution here that util.parseArgs()
is still considered experimental in Node.js. It's API can see breaking changes outside of normal semver rules. It's not expected to change, but it's worth keeping in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM. I don't really feel qualified to sign off on all the bazel changes, however, and I would note that @mikea has outstanding "Request Changes" here that I'm not entirely comfortable clearing just yet.
|
||
This directory contains scripts for automatically generating TypeScript types | ||
from [JSG RTTI](../src/workerd/jsg/rtti.h). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: not blocking for this PR but a request for later: It would be worthwhile documenting the dependencies used here... specifically the minimum Node.js version. I know the bazel stuff brings in the right version but just for detail information, especially given that the tool uses a number of experimental Node.js APIs that are fairly recent additions.
It appears that all of @mikea's feedback has been addressed. I'll clear their "change requested" now. |
@mrbbot this should be ready to land. Once landed, remember to rebase the matching internal PR on workerd main and let the CI run again there to confirm it's ready to land. |
(re-do of #110, let's try this again 🙂 )
Hey! 👋 This PR adds support for generating TypeScript types from JSG RTTI, replacing the internal
autodecl
script. This forms the basis for the nextworkers-types
version. These scripts are located in this repository rather thanworkers-types
as they depend on Bazel outputs, and we'd like to be able to share the Bazel cache in CI. Going forward, DevProd should probably be theCODEOWNER
for everything in thetypes
directory.To generate types, run:
A gist containing the generated types can be found here. This also includes a copy of the types before TypeScript transformers are applied.
(fyi, about half of the additions are from
pnpm-lock.yaml
)Implementation Notes and Questions
jsg::fullyQualifiedTypeName()
method. This behaves likejsg::typeName()
, but includes namespaces and template arguments in the returned name. Namespaces are required to differentiate nested types with the same name (e.g.DurableObjectStorageOperations::GetOptions
,KvNamespace::GetOptions
andR2Bucket::GetOptions
). Template arguments are required to differentiatejsg::IteratorBase...
types generated byJSG_ITERATOR
.workerd::jsg::rtti::Builder
, replaced the symbol key fromjsg::typeName()
tojsg::fullyQualifiedTypeName()
for the reasons above. This also affects which values should be passed as parameters toworkerd::jsg::rtti::Builder::structure()
.rtti.capnp
schema. Generally, I tried to evolve the schema in a backwards-compatible way, but it would be cleaner if we're still able to make breaking changes. /cc @mikeafullyQualifiedName
fields toStructure
andStructureType
schemas for the reasons above. I was hesitant to makename
fully-qualified since I wasn't sure which other code depended on RTTI.jsg::LenientOptional
to RTTI. When generating TypeScript definitions, we also need to be able to distinguish between optionals expectingnull
(kj::Maybe
) and others expectingundefined
. A newname
field has been added to theMaybeType
schema, similar toNumberType
andStringType
.kj::Array
,kj::ArrayPtr
andjsg::Sequence
, so a newname
field has been added to theArrayType
schema. It may be better to make this and the previous maybe field enums instead?name
field in a group withnested
members. This ensures members coming fromJSG_NESTED_TYPE_NAMED
macros have the correct names.iterator
andasyncIterator
fields to theStructure
schema. Whilst there are alreadyiterable
andasyncIterable
boolean fields, we need to know the full method types (especially the returned(Async)Iterator
type) for[Symbol.iterator]
/[Symbol.asyncIterator]
.api-encoder.c++
entrypoint that spits out RTTI to a file. All TypeScript generation scripts are written in TypeScript, so we can use the official TypeScript compiler API for creating/processing/printing AST nodes.aspect-build/rules_js
requires us to use pnpm. We should be able to use this setup for packaging/publishingworkerd
npm
packages too. /cc @penalosaTODOs
(to follow in later PRs)
autodecl
😅), they're not as accurate as they could be.param0
,param1
, ... . Ideally, we'd use the actual C++ parameter names here, so we need some way of getting these into the RTTI.