TypeInferenceProvider using another tool than pyre - Jedi #451

Open
pelson opened this issue Jan 22, 2021 · 8 comments
Labels
codemod Bundled codemods, visitors, metadata providers enhancement New feature or request

Comments

@pelson

pelson commented Jan 22, 2021

[What follows is probably heresy given that Pyre is another Instagram project, so please don't "throw me to the pyre" 🔥 - I mean no offence to that project]

I've found setting up a working pyre environment somewhat painful (building watchman from source, then getting a core dump because my project path was too long, etc.), and I found the LibCST documentation on how to actually set up a TypeInferenceProvider and FullRepoManager to be lacking an example. Indeed, the best I've found was a screenshot of a notebook in #179, which I've diligently transformed into indexable form below (for my future self who wants to be able to google an example of doing it):

import libcst

from libcst.metadata.full_repo_manager import FullRepoManager
from libcst.metadata.type_inference_provider import TypeInferenceProvider


m = FullRepoManager('libCST/', ['_maybe_sentinel.py'], {TypeInferenceProvider})
for node, type_str in m.get_metadata_wrapper_for_path('_maybe_sentinel.py').resolve(TypeInferenceProvider).items():
    code = libcst.parse_module("").code_for_node(node)
    print(f'{code}: {type_str}')

As a result, I've also looked at other means of getting inference data given a node... Jedi's Script.infer is an interesting and seemingly simple option despite Jedi and LibCST working on a different level (a Script in Jedi has the ability to look through a virtual environment / PYTHONPATH to find references etc., much like Pyre and the FullRepoManager concept in LibCST).

A quick prototype later, I have a means to get hold of the Jedi inference for a node through a metadata provider (I think this is a testament to the LibCST code that this is so simple to do 👍):

import libcst as cst
import libcst.metadata

import jedi

from typing import List


class TypeInferenceFromJediProvider(cst.BatchableMetadataProvider[List[jedi.api.classes.Name]]):
    METADATA_DEPENDENCIES = (libcst.metadata.PositionProvider, )
    gen_cache = True  # We need the cache to contain a jedi.Script.

    def __init__(self, cache) -> None:
        super().__init__(cache)
        self._script: jedi.Script = self.cache['script']

    def _parse_metadata(self, node: cst.CSTNode) -> None:
        pos = self.get_metadata(libcst.metadata.PositionProvider, node).start
        self.set_metadata(node, self._script.infer(pos.line, pos.column))

    def visit_Name(self, node: cst.Name):
        self._parse_metadata(node)

    def visit_Attribute(self, node: cst.Attribute):
        self._parse_metadata(node)

    def visit_Call(self, node: cst.Call):
        self._parse_metadata(node)

Which is used as:


class InferencePrinter(cst.CSTVisitor):
    METADATA_DEPENDENCIES = (cst.metadata.PositionProvider, TypeInferenceFromJediProvider)

    def visit_Name(self, node: cst.Name) -> None:
        pos = self.get_metadata(cst.metadata.PositionProvider, node).start
        possible_types: List[jedi.api.classes.Name] = self.get_metadata(TypeInferenceFromJediProvider, node)
        print(f"{node.value} found at line {pos.line}, column {pos.column}")
        print(f"Source: {', '.join(possible_type.full_name for possible_type in possible_types)}")
        print()

prj = jedi.Project('example-project')
code = '''
from collections import namedtuple

class MyThing(namedtuple('MyThing', ['a'])):
    pass

thing = MyThing()

thing.__len__
'''
script = jedi.Script(code, project=prj)

module = cst.parse_module(code)
wrapper = cst.metadata.MetadataWrapper(module, cache={TypeInferenceFromJediProvider: {'script': script}})

wrapper.visit(InferencePrinter())

Output:

collections found at line 2, column 5
Source: collections

namedtuple found at line 2, column 24
Source: collections.namedtuple

MyThing found at line 4, column 6
Source: __main__.MyThing

namedtuple found at line 4, column 14
Source: collections.namedtuple

thing found at line 7, column 0
Source: __main__.MyThing

MyThing found at line 7, column 8
Source: __main__.MyThing

thing found at line 9, column 0
Source: __main__.MyThing

__len__ found at line 9, column 6
Source: builtins.tuple.__len__

Given the knowledge that Jedi and LibCST are both using parso under the hood, I didn't look into trying to avoid multiple parsing stages (performance isn't so critical to me, and the performance was good enough).

I just wanted to write this down so that others can benefit - I don't expect this will make its way into the LibCST codebase. (please feel free to close the issue!)

@pelson
Author

pelson commented Jan 22, 2021

I didn't try to implement the FullRepoManager-style integration, but it looks like that would work quite well to set up the appropriate jedi.Script instance in the cache automatically. I'll post here if I end up doing that (and if I don't but somebody else does, please post here!).

@zsol
Member

zsol commented Jan 22, 2021

At a glance, this is really cool! I can take a deeper look a bit later

@zsol
Member

zsol commented Jan 27, 2021

I'd be happy to accept something that adds this functionality to LibCST. My only minor issue is with the name: the metadata provider doesn't really give back types but rather the names of definitions at the line/column, if I understand correctly (so you'll never get back something like Callable[[int], SomeClass] from it). How about calling it DefinitionProvider or CrossReferenceProvider?

@sk-
Contributor

sk- commented Feb 12, 2021

That's really great. As an extra data point, this is the feature request I filed with the mypy team: python/mypy#4868. They are even considering adding a similar API.

@zsol zsol added the enhancement New feature or request label May 19, 2021
@devmessias

Hi all. I’m facing a similar problem. It seems to be harder than I thought.

In my case, I want to annotate an XML representation of an AST with type information provided by mypy. I'm currently trying to solve this using mypy cache files, but the cache files don't have enough information to relate the inferred types to tokens in the source code or to nodes in the original AST.

Here is an example of how a FunctionDef node is represented in a mypy cache:

[image]

Is there anyone working on this issue now?

@zsol
Member

zsol commented Jun 2, 2022

Nobody's working on this AFAIK. Out of curiosity: what's preventing you from using pyre for this?

@devmessias

Nothing, I already have pyre integration in my project https://github.com/pyastrx/pyastrx. However, most of the projects that I've worked on used mypy, and I believe most of the projects in my new job also use mypy.

@devmessias

I've started to write a PR for this in mypy. So far, I'm able to extract some information:

[image]

But mypy would need to approve other changes, like exposing the end_col_offset.
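
To illustrate why the end offsets matter when relating inferred types back to nodes, here is a minimal sketch using only the standard-library `ast` module (not mypy's own AST, whose node classes differ). It assumes Python 3.8+, where `end_lineno` and `end_col_offset` are available; the sample source and the dictionary names are made up for illustration.

```python
import ast

# Hypothetical sample: four nested expressions all start at line 1, column 0.
source = "thing.attr.method()\n"
tree = ast.parse(source)

by_start = {}  # keyed by (line, column) only
by_span = {}   # keyed by (line, column, end_line, end_column)

for node in ast.walk(tree):
    if isinstance(node, ast.expr):
        start = (node.lineno, node.col_offset)
        span = start + (node.end_lineno, node.end_col_offset)
        by_start.setdefault(start, []).append(type(node).__name__)
        by_span.setdefault(span, []).append(type(node).__name__)

# Start positions alone are ambiguous; full spans are not.
print(by_start[(1, 0)])
for span, names in sorted(by_span.items()):
    print(span, names)
```

With only start positions, a type reported for `(1, 0)` could belong to the call, either attribute access, or the bare name. With the full span as the key, each expression is identified uniquely.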

@zsol zsol added the codemod Bundled codemods, visitors, metadata providers label Jun 16, 2022
rominf pushed a commit to rominf/LibCST that referenced this issue Dec 7, 2022
This change is an RFC (please read the whole change message).

Add `MypyTypeInferenceProvider` as an alternative to
`TypeInferenceProvider`. The provider infers types using mypy as a
library. The only requirement is to have the latest mypy installed.
The inferred types are mypy types: the mypy type system is well
designed, and using it directly avoids a conversion and keeps things
simple. For compatibility and extensibility reasons, these types are
stored in a separate field, `MypyType.mypy_type`.

Let's assume we have the following code in the file `x.py` which we want
to inspect:
```python
x = [42]

s = set()

from enum import Enum

class E(Enum):
    f = "f"

e = E.f
```

Then, to play with mypy types, one can use code like:
```python
import libcst as cst

from libcst.metadata import MypyTypeInferenceProvider

filename = "x.py"
module = cst.parse_module(open(filename).read())
cache = MypyTypeInferenceProvider.gen_cache(".", [filename])[filename]
wrapper = cst.MetadataWrapper(
    module,
    cache={MypyTypeInferenceProvider: cache},
)

mypy_type = wrapper.resolve(MypyTypeInferenceProvider)
x_name_node = wrapper.module.body[0].body[0].targets[0].target
set_call_node = wrapper.module.body[1].body[0].value
e_name_node = wrapper.module.body[-1].body[0].targets[0].target

print(mypy_type[x_name_node])
 # prints: builtins.list[builtins.int]

print(mypy_type[x_name_node].fullname)
 # prints: builtins.list[builtins.int]

print(mypy_type[x_name_node].mypy_type.type.fullname)
 # prints: builtins.list

print(mypy_type[x_name_node].mypy_type.args)
 # prints: (builtins.int,)

print(mypy_type[x_name_node].mypy_type.type.bases[0].type.fullname)
 # prints: typing.MutableSequence

print(mypy_type[set_call_node])
 # prints: builtins.set

print("issuperset" in mypy_type[set_call_node].mypy_type.names)
 # prints: True

print(mypy_type[set_call_node.func])
 # prints: typing.Type[builtins.set]

print(mypy_type[e_name_node].mypy_type.type.is_enum)
 # prints: True
```

Why?

1. `TypeInferenceProvider` requires pyre (`pyre-check` on PyPI) to be
   installed. mypy is more popular than pyre. If the organization uses
   mypy already (which is almost always the case), it may be difficult
   to convince colleagues (including the security team) that "we need
   yet another type checker". `MypyTypeInferenceProvider` requires only
   the latest mypy.
2. Even though it is possible to run pyre without installing watchman,
   this is not advertised. Installing watchman is not always possible
   because of system requirements, or because of security
   requirements like "we install only our favorite GNU/Linux
   distribution packages".
3. Using `TypeInferenceProvider` requires running `pyre start` before
   the execution and `pyre stop` after it. This may be inconvenient,
   especially when pyre was not used before.
4. Types produced by pyre in `TypeInferenceProvider` are just strings.
   For example, it is not easy to infer that some variable is an
   enum instance. `MypyTypeInferenceProvider` makes it easy, see the
   code above.

Drawbacks:

1. Speed. mypy is slower than pyre, so `MypyTypeInferenceProvider` is
   slower than `TypeInferenceProvider`.
   How to partially solve this:
   1. Implement AST caching in mypy. It may be difficult, but it
      would speed up all the projects that use this functionality.
   2. Implement inferred types caching inside LibCST. As far as I know,
      no caching at all is implemented inside LibCST, which is the
      prerequisite for inferred types caching, so the task is big.
   3. Implement LibCST CST to mypy AST conversion. I am not sure if this
      is possible at all. Even if it is possible, the task is huge.
2. Two providers doing similar things will be present in LibCST. This
   can potentially lead to a situation where two typecheckers need to
   be installed to get all codemods from the library running.
   Alternatives considered:
   1. Put `MypyTypeInferenceProvider` inside a separate library (say,
      LibCST-mypy, or `libcst-mypy` on PyPI). This will explicitly
      separate `MypyTypeInferenceProvider` from the rest of LibCST.
      Drawbacks:
      1. The need to maintain a separate library.
      2. Limited visibility (people need to know that the library
         exists).
      3. Since some codemods cannot be implemented easily without the
         library, for example, an `if-elif-else` to `match` converter
         (it needs powerful type inference), they are doomed to not be
         shipped with LibCST, which makes the latter less attractive for
         end users.
   2. Implement a base class for inferred types, which inherits from
      `str` (to keep compatibility with the existing codebase), and
      a mechanism for dynamically selecting the `TypeInferenceProvider`
      typechecker (mypy or pyre; the user can do this via an environment
      variable). If the code inside LibCST requires just shallow type
      information (so, just `str` is enough), then the code can run with
      any typechecker. The remaining code (such as the `if-elif-else` to
      `match` converter) will still require mypy.
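
The second alternative could be sketched roughly as follows. This is
only an illustration of the idea, not code from this change:
`InferredType`, its `rich_type` attribute, `select_typechecker`, and the
`LIBCST_TYPECHECKER` variable name are all hypothetical.

```python
import os


class InferredType(str):
    """A type name that behaves like a plain string (hypothetical).

    `rich_type` optionally carries a checker-specific object, for example
    a mypy type; it stays None when only the string form is known.
    """

    def __new__(cls, fullname, rich_type=None):
        obj = super().__new__(cls, fullname)
        obj.rich_type = rich_type
        return obj


def select_typechecker(environ=None):
    """Pick the backend from a made-up LIBCST_TYPECHECKER variable."""
    environ = os.environ if environ is None else environ
    name = environ.get("LIBCST_TYPECHECKER", "pyre").lower()
    if name not in ("pyre", "mypy"):
        raise ValueError(f"unsupported typechecker: {name!r}")
    return name


inferred = InferredType("builtins.list[builtins.int]")
# Existing string-based code keeps working unchanged.
print(inferred.startswith("builtins.list"))
# Code that needs deep type information checks for the richer object.
print(inferred.rich_type is None)
print(select_typechecker({"LIBCST_TYPECHECKER": "mypy"}))
```

Because the class inherits from `str`, codemods that only compare or
print type names would not need to change, while codemods that need
deep type information could require a non-`None` `rich_type`.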

Misc:

The code does not lint in my environment: for some reason `pyre check`
cannot find the `mypy` library.

Related to:

* Instagram#451
* pyastrx/pyastrx#40
* python/mypy#12513
* python/mypy#4868