Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor transformations and add name transformation #56

Merged
merged 11 commits into from
Jul 7, 2021

Conversation

brxck
Copy link
Collaborator

@brxck brxck commented Jun 30, 2021

The scope of this PR has changed a bit!

This refactors the transformations code to increase reusability and lower the cost to support new transformations (modifiers). This also adds the name, function name, and class name transformations.

NodeMatchers are split into component functions NodeFinders and SelectionExtractors. Functions have been added to combine and compose these into new NodeMatchers.


Here's the first step of #27, relatively simple.

I'd be happy to look into extending this for "funk name," "funk class," etc, but I am not entirely sure what approach to take or if there are already provisions for this sort of thing.

This includes/depends on work in #32. I wish stacked PRs were a thing in GitHub, lmk if you want to handle to this kind of PR differently.

@brxck
Copy link
Collaborator Author

brxck commented Jun 30, 2021

Oh this also includes a commit which adds some logging of the types of the selected node and its ancestors. This has been pretty handy developing these transformations. If you prefer, I can easily drop that commit.

@brxck
Copy link
Collaborator Author

brxck commented Jul 2, 2021

@pokey I'm still a little hung up on how to go about implementing a function name transformation.

Selecting a function in both our languages is non trivial. For example, the Typescript implementation is three cascading matchers:

  namedFunction: cascadingMatcher(
    possiblyExportedDeclaration("function_declaration", "method_definition"),
    (editor: TextEditor, node: SyntaxNode) =>
      node.type === "public_field_definition" &&
      getValueNode(node)!.type === "arrow_function"
        ? simpleSelectionExtractor(node)
        : null,
    possiblyWrappedNode(
      (node) => node.type === "export_statement",
      isNamedArrowFunction,
      (node) => [getDeclarationNode(node)]
    )
  ),

This ends up being a substantial amount of logic to extract and duplicate for this one transformation. It feels like we should have the ability to compose/extend matchers if we want to implement more things like this.

If we separated finding a node and extracting a selection, then we could compose the node finder functions. Which could look very roughly like:

import { flowRight } from "lodash";

const nodeMatchers: Record<ScopeType, NodeMatcher> = {
  functionName: flowRight([simpleSelectionExtractor, findName, findNamedFunction]),
  namedFunction: flowRight([simpleSelectionExtractor, findNamedFunction])
}

Curious what you think about this!

@pokey
Copy link
Member

pokey commented Jul 2, 2021

Sounds good to me if you can make it work! My worry is that extracting the name will be different depending which of the cascading marchers hits. For example if the matcher hits an export, it returns the whole export statement, so I believe it would then have to drill back down to get the name. I think it's simpler to avoid that because the name extractor doesn't need to try to include the export, whereas the function matcher goes to great lengths to do so. So I think the function matching process will actually be a bit different if you just want the name

But if you think you can make it work give it a shot! Otherwise I'd just copy the matchers in the cascade and tear off the stuff you don't need and get the name from there. If after that it still looks really similar to the original then yeah maybe makes sense to abstract it

Not sure if any of this made sense it's pretty late here 😅

@brxck brxck force-pushed the name-transform branch from 9f6e61a to bd7dd89 Compare July 5, 2021 17:57
Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brxck this is awesome!! great improvement. I left a bunch of nits. I don't feel strongly on findX vs xFinder but I do think we should try to be super consistent about our policy. Also, I'd update the PR description: def no longer a simple change 😄

}

export type NodeFinder = (node: SyntaxNode) => SyntaxNode | null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might move the NodeMatcher above down to right above this one; it got separated in the rebase

I also might add a docstring for this type and the next one as they're important abstractions

src/extension.ts Outdated
@@ -197,6 +198,25 @@ export async function activate(context: vscode.ExtensionContext) {
addDecorations();
};

function logBranchTypes(event: vscode.TextEditorSelectionChangeEvent) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind moving this one to a separate file? Like debug.ts or something?

src/extension.ts Outdated
Comment on lines 202 to 207
const getBranch = (branch: SyntaxNode[]): SyntaxNode[] => {
if (branch[0].parent) {
return getBranch([branch[0].parent, ...branch]);
}
return branch;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoah recursion. this one is a bit tough to follow 😅. would getAncestors be a better name? Presuming I've understood the function correctly...

I think this one might be a bit clearer without recursion too. Just iterate until the parent is null and add the nodes as you go up. I also prefer == null comparisons to boolean transformations

Also I generally try to avoid nesting functions unless it needs to capture scope

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I hear you, this was mostly meant for me at the time of writing. This is probably an artifact of the fact that my first intro to programming around syntax trees was in SML & Racket 😆 I can clean this one up and move it to a separate file.

src/extension.ts Outdated
branch.forEach((node, i) => console.log(">".repeat(i + 1), node.type));
const leafText = leaf.text.replace(/\s+/g, " ").substring(0, 100);
console.log(">".repeat(branch.length), `"${leafText}"`);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. yeah this one could be useful. I think I might use console.debug for these tho

return function (editor: TextEditor, initialNode: SyntaxNode) {
let returnNode: SyntaxNode | null = initialNode;
for (const finder of finders) {
returnNode = returnNode ? finder(returnNode) : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not short-circuit if one of the finders returns null?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to short circuit. Otherwise you could be effectively skipping a finder, which could yield the incorrect node.

For instance, take this completely made up example. We want to get the return type of a function:

returnType: composedMatcher([
    findNodeOfType("function_declaration"),
    findNodeOfType("return_type"),
    getTypeNode
])

Imagine a situation where we find a declaration but we don't find anything for findNodeOfType("return_type"). If we continue to the next finder, we could match any other encountered type (like a parameter type) in the function declaration, returning an incorrect node.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah oh right you are short-circuiting. Missed that. I think maybe would be a bit easier to understand if you just had a return statement when you get a null? Then it's really obvious you're short-cirtuiting and can remove ternary at bottom

listTypes.includes(node.parent?.type ?? "") && listElementMatcher(node),
(node) => node.type === "," || node.type === "[" || node.type === "]",
", "
pairKey: composedMatcher([findNodeOfType("pair"), getKeyNode]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oooh shiny 😊

nodeMatches: (node: SyntaxNode) => boolean,
isDelimiterNode: (node: SyntaxNode) => boolean,
defaultDelimiter: string
export function composedMatcher(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm i tend to like variadic signatures for this type of thing, so you can say eg composedMatcher(findNodeOfType("pair"), getKeyNode) (notice no square brackets). But obv that doesn't work with this selector param. Not sure what the right answer is here. Prob fine as you have it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would have preferred that too.

I think maybe it can actually be done (see this PR for examples), but when I tried it seemed to break type inference within the function so I backed off. I did not spend much time looking into this though.

It would look something like this (?):

export function composedMatcher(
    ...args: [...NodeFinder[], SelectionExtractor]
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think that's potentially more confusing. Maybe let's just leave it

matcher(getNameNode),
matcher((node) => (node.type === "assignment" ? getLeftNode(node) : null))
),
functionName: notSupported,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be hard to implement this one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt it would! I think I just got distracted by what this PR turned into 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha fair enough; there was a lot going on in this PR 😄

isNamedArrowFunction,
(node) => [getDeclarationNode(node)]
matcher(
findPossiblyWrappedNode(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why this one can't use possiblyExportedDeclaration 🤔. Looks like it was already this way but is a bit curious

statement: matcher(possiblyExportedDeclaration(...STATEMENT_TYPES)),
arrowFunction: typeMatcher("arrow_function"),
functionCall: typeMatcher("call_expression", "new_expression"),
functionName: cascadingMatcher(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you not also support name for typescript?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's using the one defined in getPojoMatchers. Unsure if it should stay there? Currently typescript is the only language using it, but it's the most generic implementation possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't think it should be in pojoMatchers. "Pojo" stands for "plain old javascript object", so more or less should only contain json stuff. But if you want to do a rename / refactor I'm fine with it

@pokey pokey added this to the First release for new users milestone Jul 6, 2021
@pokey
Copy link
Member

pokey commented Jul 6, 2021

Do we also want to implement class names as part of this PR or leave that for follow on work?

@brxck brxck force-pushed the name-transform branch from bd7dd89 to 3a79d5d Compare July 6, 2021 20:13
@brxck brxck requested a review from pokey July 6, 2021 20:14
Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok looking really good! Left a couple more minor comments. Also don't forget to update PR description. Nothing fancy just it's a bit outdated

import * as vscode from "vscode";
import { SyntaxNode } from "web-tree-sitter";

export function logBranchTypes(getNodeAtLocation: any) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally try to avoid any, tho I recognize this function is just for debugging. Should be a pretty straightforward signature tho no?

Also, I'd probably opt for a class here rather than returning a function, but let's leave for now as it's just for debugging

src/debug.ts Outdated
Comment on lines 11 to 16
const ancestors: SyntaxNode[] = [];
let node: SyntaxNode = getNodeAtLocation(location);
while (node.parent) {
ancestors.unshift(node.parent);
node = node.parent;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner 😊. I still prefer node.parent != null, even tho slightly more verbose

Copy link
Collaborator Author

@brxck brxck Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang, I forgot about that! I will probably continue to forget about it unless an eslint rule is instituted 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha yeah good point. I think this would do the trick?

Comment on lines 153 to 160
functionName: composedMatcher([
possiblyDecoratedDefinition("function_definition"),
getNameNode,
]),
className: composedMatcher([
possiblyDecoratedDefinition("class_definition"),
getNameNode,
]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well these were too easy 😂. did you try it on a decorated definition? I'd think it would not be able to find the name on the decoration node

getNameNode,
]),
composedMatcher([findClassPropertyArrowFunction, getNameNode]),
composedMatcher([findNamedArrowFunction, getNameNode])
),
className: composedMatcher([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice this one was pretty easy too 😊. also, did you test this one on something that was exported? also thinking it might not find the name on the export statement

export const findNode =
(isTargetNode: (node: SyntaxNode) => boolean): NodeFinder =>
(node: SyntaxNode) => {
export const findNode = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be nodeFinder or predicateNodeFinder or something for consistency with new naming scheme?

@brxck brxck changed the title Add name transformation Refactor transformations and add name transformation Jul 7, 2021
@brxck
Copy link
Collaborator Author

brxck commented Jul 7, 2021

Thank you for the solid review @pokey! Think I've addressed everything now.

@pokey pokey merged commit 99eea68 into cursorless-dev:master Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants