Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace espree and doctrine with typescript in docgen. #21238

Closed
wants to merge 70 commits into from

Conversation

sainthkh
Copy link
Contributor

@sainthkh sainthkh commented Mar 29, 2020

Description

How has this been tested?

  • Tested by refactoring docgen tests.
  • And added tests when necessary. (Like type generation)

Screenshots

N/A

Types of changes

Bug fix

Changes

  • Object -> object. TypeScript treats Object as any. And it's the first don't of the do's and don'ts list. It might need a lot of changes in the code.
    • In TypeScript, object means {} thing. Object means this global object. That's why it is not recommended to use Object.
  • Array<X>, Array.<X>, X[] -> X[]. Type generator doesn't respect how it is written. Actually, there is no way to find which is written in which way. I thought [] is a visual way of telling "it's an array"
    • The alternative might use [] for primitives and Array<> for user types that start with a capital letter like Array<ReactNode> or all Array<>.
  • I removed line numbers in IR nodes. Because we need to calculate it every time we need it. It is not calculated by default like espree 4.x
  • TypeScript notation over JSDoc. I did it because it's more clear. It means:
    • string | undefined over string=
    • number | nullable over ?number`
  • When there is a default value, it's added at the back like this: - _status_ string: Notice status. (Default: info) check here
    • I thought the type name is a bit vague and thought about adding (optional) after name like boolean (optional). I'm not sure about this and didn't implement it.
  • I indented properties like below. It's how TypeScript handles jsdoc objects and I think it looks better.
-   _options_ `object | undefined`: Notice options.
    -   _context_ `string`: Context under which to group notice. (Default: `global`)
    -   _id_ `string | undefined`: Identifier for notice. Automatically assigned if not specified.
    -   _isDismissible_ `boolean`: Whether the notice can be dismissed by user. (Default: `true`)
    -   _type_ `string`: Type of notice, one of `default`, or `snackbar`. (Default: `default`)
    -   _speak_ `boolean`: Whether the notice content should be announced to screen readers. (Default: `true`)
    -   _actions_ `WPNoticeAction[] | undefined`: User actions to be presented with notice.

Check code here

Later works

In this PR, I focused on migrating and tried not to add new features as much as possible. Because it's big (changes in 120 files with +3,454 −8,408) and it can make code hard to review and get merged.

The list below is the things that should be added for better developer experience.

  • Create separate eslint jsdoc config for projects type-checked by typescript.
    • TypeScript checks types, we don't need some eslint-plugin-jsdoc configs like valid-types. We can remove /* eslint-disable */s for perfectly ok types like {[setting:string]:any}.
    • Turn off 'object -> Object' rule.
  • Handle JSDoc types defined in non-last or non-exported comments. This makes users check source code to see what's going on.
    • Currently, docgen cannot handle docgen types like below:
/**
 * Type below is ignored.
 * @typedef WPLinkControlProps
 *
 * @property {(WPLinkControlSetting[])=} settings An array of settings objects.
 */

/**
 * Function
 * @param {WPLinkControlProps} props Component props.
 */
export function LinkControl( {
	value,
} ) { }
/**
 * Type below is ignored.
 * @typedef WPLinkControlProps
 *
 * @property {(WPLinkControlSetting[])=} settings An array of settings objects.
 */

// Value for test.
const CREATE_TYPE = '__CREATE__';

/**
 * Function
 * @param {WPLinkControlProps} props Component props.
 */
export function LinkControl( {
	value,
} ) { }
  • Throw errors when something wrong happened like wrong type name, Object.
  • Known issues for parsing default values
    • array in objects like [p={x:[1, 2]}].
    • escaping string-opening character like "abc\"def\"ghi".
  • Add docgen to TS-enabled project. (Testing: Add TypeScript types validation for modules #18838)
  • Updating TS for docgen isn't just changing numbers. For example, we need to add new awaited type for TS3.9
  • Debug mode: print out ast and ir to files. It's not possible now because of the parent element of the TypeScript ast.

Note

Before this PR is merged, it should be rebased and docs should be regenerated. If not, it can break other PRs.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • [N/A] My code follows the accessibility standards.
  • My code has proper inline documentation.
  • [N/A] I've included developer documentation if appropriate.
  • [N/A] I've updated all React Native files affected by any refactorings/renamings in this PR.

@oandregal
Copy link
Member

Hey, @sainthkh thanks for working on this, and I'm excited to help with this. I want to set expectations as to when would I be able to, though: I think I can come to this by next week.

@oandregal oandregal added [Tool] Docgen /packages/docgen [Type] Developer Documentation Documentation for developers labels Mar 30, 2020
@sainthkh
Copy link
Contributor Author

@nosolosw Thanks. I can wait. There're some PRs of mine that took 3+ months to merge. It doesn't matter to me.

@gziolo
Copy link
Member

gziolo commented Mar 30, 2020

@sainthkh, just wanted to say that I appreciate all the work spent on this PR 👍

@nosolosw is the best person to validate the implementation proposed.

@aduth and @sirreal – they should be able to help validate all the changes proposed to bridge JSDoc and TypeScript.

Copy link
Member

@sirreal sirreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an amazing effort we'll really benefit from. Thanks!

I've done an initial scan and left some notes and questions.

@@ -210,8 +210,7 @@ function LinkControl( {
// is still unassigned after calling `onChange`.
const hadFocusLoss =
isEndingEditWithFocus.current &&
wrapperNode.current &&
! wrapperNode.current.contains( document.activeElement );
! wrapperNode.current?.contains( document.activeElement );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this change required for some reason?

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line of code is why everything started. After #19831, everyone thought that we could use optional chaining in Gutenberg.

But we couldn't because espree 4.x in docgen cannot parse it. @aduth thought we can simply do it by updating espree to the latest version (#19952). Unfortunately, espree doesn't support leadingComment properties after 5.0. So, we had to change the parser. (Older version of docgen extract JSDoc from leadingComment and parsed it with doctrine. And doctrine is deprecated and unused.)

As TypeScript can parse JS and JSDoc, it was chosen.

This line means that we can use optional chaining without any error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to back it out at this point, but I'd generally suggest to find ways to limit the scope of a pull request as much as possible, especially if it's already a +3400/-8400 proposal. It's definitely related as the original motivation of the changes, but I think it could have been fine to do as a separate follow-up pull request.

Copy link
Contributor Author

@sainthkh sainthkh Apr 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's not a topic of this PR, it should be tested.

That's why I reverted it and added new test new-syntax under test/fixtures.

Later, when new syntax doesn't work or want to check if a new syntax works, we can use this file.

module.exports = function( filePath ) {
const options = {
allowJs: true,
target: ts.ScriptTarget.ES2020,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeat:

Should we use an evergreen target like ESNext?

Suggested change
target: ts.ScriptTarget.ES2020,
target: ts.ScriptTarget.ESNext,

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't important because this option is for the transpiled result of the code. It's not for how TypeScript treats new syntaxes. In other words, even when we set this to ES3, TypeScript can compile optional chaining and other ES2015-2020 features. But in this case, the result of the transpilation can be run on the environments that only support ES3 (maybe notorious IE9?).

We don't need transpiled JavaScript in docgen. All we need is the generated AST. So, it doesn't matter.

But to avoid confusion, I removed target option from compile.js and changed ES2020 to ESNext in other files. (8d6f136)

packages/docgen/scripts/dump-ast.js Outdated Show resolved Hide resolved
const code = raw
// typescript interprets @wordpress in @example code as a JSDoc tag.
// So, it should be replaced for the time being.
.replace( /@wordpress/g, '__WORDPRESS_IMPORT__' )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces @wordpress globally in the source, not just comments?

So, it should be replaced for the time being.

Are we tracking some condition when this could be removed?

Do you think we could hit issues where e.g. import escapeHtml from '@wordpress/escape-html' becomes import escapeHtml from 'WORDPRESS_IMPORT/escape-html'?

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. I commented when they're restored. (5ca2084)

2. That can be a problem when we have written code that exports imported items directly from other packages like this:

// packages/some-package/src/index.js

import escapeHtml from '@wordpress/escape-html';

export { escapeHtml }

In this case, the parser tries to open packages/escape-html/src/index.js and find jsdoc of escapeHtml. Then, it can fail because __WORDPRESS_IMPORT__ doesn't exist. I guess it's more of a design flaw than a feature. Don't you think so?

Maybe, but for those rare cases, we need to show human-readable reason to the users than cannot find module '__WORDPRESS_IMPORT__'

3. But your comment made me curious about this case:

/**
* @param {import('@wordpress/element').WPSyntheticEvent} p what if @wordpress?
*/

I found that it shows import( '__WORDPRESS_IMPORT__/element' ). I fixed it. (68d1198)

Comment on lines 37 to 44
const getLocalName = ( s ) => {
switch ( s.kind ) {
case SyntaxKind.ClassDeclaration:
case SyntaxKind.FunctionDeclaration:
default:
return s.name ? s.name.text : '*default*';
}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the switch do anything relevant here?

Suggested change
const getLocalName = ( s ) => {
switch ( s.kind ) {
case SyntaxKind.ClassDeclaration:
case SyntaxKind.FunctionDeclaration:
default:
return s.name ? s.name.text : '*default*';
}
};
const getLocalName = ( s ) => s.name ? s.name.text : '*default*';

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was migrating getLocalName(), I copied the original function first to think what can be changed.

In the original, switch makes perfect sense. But in the new one, we don't need it. So, I removed it entirely. (8e9839c)

* - export default function myDeclaration() {}
* - export class MyDeclaration {}
* - export function myDeclaration() {}
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some cases that don't seem to be covered in this list, are these handled properly?

export { default as foo, createContext as cc } from 'react'
const bar = 'baz'
const myDefault = 'd'
export { bar as quux, myDefault as default }

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added them here because I want to be clear which condition handles which AST case. I thought your examples could be deduced from export { a, b } from 'c' and export { myDeclaration };, but it seems that it wasn't. I added them. (39bc2ac)

And they're handled properly and tested with unit tests.

Comment on lines 109 to 123
if ( rawValue[ 0 ] === '{' && rawValue[ rawValue.length - 1 ] === '}' ) {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty conditional block here?

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I first thought about parsing objects and arrays. But I realized that to add them to docs, I need to return them back to strings. And it's unnecessary. So, I gave it up.

But I forgot to remove the code for that attempt. (85a6aee)

Comment on lines 105 to 107
if ( rawValue === 'null' ) return null;
if ( rawValue === 'true' ) return true;
if ( rawValue === 'false' ) return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize we allowed this, I'd expect linter to enforce braces on these conditional blocks.

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, curly rule doesn't work. When I add it to root .eslintrc.js, it works. But somehow, this rule is ignored although it's set to [ 'error', 'all' ] in eslint-plugin/configs/es5.js.

Fixed them anyway. (2a37323)

We need to open an issue for this, right?

Comment on lines 80 to 87
// if ( result === undefined ) {
// process.stdout.write(
// '\nFile was processed, but contained no ES6 module exports:'
// );
// process.stdout.write( `\n${ sourceFilePath }` );
// process.stdout.write( '\n\n' );
// process.exit( 0 );
// }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove or restore this commented out section?

Copy link
Contributor Author

@sainthkh sainthkh Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revived it. (b7118f1)

@sainthkh sainthkh force-pushed the update/19952-docgen-ts branch from c7bf7b1 to 7762075 Compare March 31, 2020 02:09
@sainthkh sainthkh requested review from sirreal and removed request for etoledom and jorgefilipecosta March 31, 2020 02:15
@sainthkh
Copy link
Contributor Author

@sirreal Thanks for all the reviews. I fixed them all.

@sainthkh sainthkh force-pushed the update/19952-docgen-ts branch from dd74e79 to d150875 Compare April 3, 2020 01:00
@sainthkh
Copy link
Contributor Author

sainthkh commented Apr 3, 2020

@aduth

1. When I was writing this, I also thought about the length of this PR. In my experience, when PR is too long, they're abandoned because it's too hard to follow the changes.

2. But the problem is that this PR is based on the entirely new AST. As TypeScript should incorporate its own type syntaxes, they had to create their own AST, so TypeScript AST is quite different from other ASTs.

(When you compare the ASTs of these 3 different parses (espree, babel, recast), they're mostly same.)

Because of that, almost all of docgen code is re-written.

That's why it's hard to break this PR into a few smaller PRs. One step is tightly related to the next step.

3. That made me also think how to make this PR easier to review.

First thing I did was to revert Update Doc commit for the time being. It'll break Build artifacts test. I tested it with my local clone. It became much easier to review. We can update docs when code review is done.

I'm also thinking about writing about the code flow and other helpful notes or even setting up small goals to focus on small areas of code. But I also think this can bias the code review. I'm a bit hesitant about setting goals for review, or should I?

I also thought about removing changes in the list of changes above. It does make us discuss less by postponing some issues later in other new issues. But it doesn't make me remove some codes to make it easier to review.

@sainthkh
Copy link
Contributor Author

sainthkh commented Apr 5, 2020

@aduth I created 2 mini PRs: #21404, #21405.

@gziolo
Copy link
Member

gziolo commented Apr 5, 2020

I landed #21405 👍

@oandregal
Copy link
Member

Hi @sainthkh, thanks for your patience! The first thing I want to say is that I really appreciate all the love and work you've put into this, bravo! 👏

This is promising, although I was wondering what do you think of these:

  • One of the issues we have now is that we're using two different parsers (espree for docs & babel for code). If we do this, we'll still have two, so I worry that we may end up in the same place we're now. Is there any way we can consolidate both? Can we use babel for docgen and delegate the JSDoc (with TypeScript notation) to something else (perhaps the typescript parser)?

  • From what I gather, the TypeScript AST is different from the ESTree-based ones that most tools in the ecosystem use. I was thinking whether this could pose problems when/if we want to integrate with some of those (ex: reusing parts of react-docgen to document the largely undocumented React components we have, testing, etc).

@sainthkh
Copy link
Contributor Author

1.

When I first decided to use typescript, I expected TypeScript to correctly parse and generate type name from TypeScript AST. And as we all know, it didn't. (Because of that, I needed to create this file.) Major reason for choosing TypeScript was false. So, I don't mind changing it to @babel/parser. When we use babel, we can reuse most of the code. (I think it's another advantage.)

2.

I am against using typescript as a JSDoc parser. Because when you run the typescript-version of docgen command. You can feel that it's slower than espree version. It seems that typescript has a lot of things to load. (It's not a surprise because it needs to load its type checkers beside JS parsing.)

There is a package called comment-parser. If we decide to use @babel/parser as the main parser for JS, I would use it as the parser for JSDoc comments.

It's a simple parser. It becomes both our advantage and disadvantage.

  • advantage: It doesn't throw an error if the type is invalid. (Our type generator and VS Code can handle this problem.) It's easy to implement the types. It parses default value out of the box (TypeScript doesn't.)
  • disadvantage: It cannot normalize type names. X[], Array<X>, Array.<X> are all same types. But it prints out all of them as they're written. We need to discuss whether it's right to normalize them, if yes, how to normalize them.

Maybe, we can try @babel/plugin-transfrom-typescript to parse the types, but I don't know it works. It needs experiment.

3.

As I don't mind changing to Babel, the second question is automatically solved.

@oandregal
Copy link
Member

@sainthkh thanks for the additional context!

I was thinking about your comment and the next steps:

  1. We change espree by @babel/parser in a separate PR (keep this PR one, just in case). This will fix the syntax issues that we now have.
  2. We look at the alternatives we have to change doctrine after, in a new PR.

It feels to me that way we're fixing the most pressing issues (using new syntax) and we buy some time to look into the TypeScript integration. However, to be honest, I don't know what kind of TypeScript issues we may run into in step 2 as I'm still leveling up in that area (although I've looked at the logs and it doesn't seem to me that we have now any blockers). I'd welcome a sanity check, though @sainthkh @aduth @sirreal do you think this is sound?

@sainthkh
Copy link
Contributor Author

Good enough. I'll test babel typescript to check if we can reuse type generator code in this PR.

@sainthkh
Copy link
Contributor Author

sainthkh commented Apr 24, 2020

@nosolosw

I was reading Gutenberg code to implement #21807. It made me react a bit slow. Now, I finished replacing espree with babel in #21853.


I experimented if babel can generate type ast. And it can. Here's the proof.. All we need to do is add "typescript" option to the plugins.

We can parse types with babel by generating code like type X = string | number and parse it.


Todo list

@aduth
Copy link
Member

aduth commented May 1, 2020

Will we still want this pull request in its current form, given that we've merged #21853? I understand there are still tasks remaining, but it's unclear to me if we'll pursue those as separate independent pull requests, or if you'd prefer to try to reconcile the changes proposed in this branch to support them.

@aduth
Copy link
Member

aduth commented May 1, 2020

  • Replace doctrine with comment-parser

Noting that I remarked some comments on initial experimenting to use comment-parser (including a patch) at #18045 (comment) . Overall it didn't work as well as I expected, though it may still end up being the best tool we have available, without going to the extent of creating something totally new.

@sainthkh
Copy link
Contributor Author

sainthkh commented May 2, 2020

  • I left this PR open to cherry-pick the related codes.
  • I'm currently experimenting with comment-parser. Maybe we can parse and verify type definition with @babel/parser and jsdoctypeparser. After experiment is done, I'll share you with the result.

@gziolo gziolo added the [Status] Stale Gives the original author opportunity to update before closing. Can be reopened as needed. label Nov 14, 2020
@gziolo
Copy link
Member

gziolo commented Nov 14, 2020

I left this PR open to cherry-pick the related codes

@sainthkh, any plans to reuse more code from this PR or can it be closed?

@sainthkh sainthkh closed this Nov 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Status] Stale Gives the original author opportunity to update before closing. Can be reopened as needed. [Tool] Docgen /packages/docgen [Type] Developer Documentation Documentation for developers
Projects
None yet
5 participants