Introduce plugin for migrating scalatest #572

Open
wants to merge 4 commits into
base: master
1 change: 1 addition & 0 deletions .gitignore
@@ -23,6 +23,7 @@ target
Cargo.lock
tmp_test*
env/
**.egg-info


# Dependencies
22 changes: 22 additions & 0 deletions plugins/pyproject.toml
@@ -0,0 +1,22 @@
[build-system]
requires = ["setuptools>=42", "wheel"]
build-backend = "setuptools.build_meta"

[tool.poetry]
name = "scala_test"
version = "0.0.1"
description = "Rules to migrate 'scalatest'"
# Add any other metadata you need

[tool.poetry.dependencies]
python = "^3.9"
polyglot_piranha = "*"
Contributor

What does "*" mean here? Any version? Latest?

Contributor Author

Yes, `*` matches any version, so it resolves to the latest one available.

Contributor

Do we want to be silently on latest? Could we have this just be in sync with the current released Piranha version? (Also, pytest below should probably be pinned to a concrete version, and we should manually keep the dependency up to date, no?) Basically, in terms of reproducibility, I am wary of dependencies without an explicit version.


[tool.poetry.dev-dependencies]
pytest = "*"

[tool.poetry.scripts."scala_test"]
main = "scala_test.main:main"

[tool.poetry.scripts."pytest"]
main = "pytest"
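The reproducibility concern raised in the review thread above could be enforced with a small check that flags wildcard constraints. The sketch below is illustrative only: `find_unpinned` is a hypothetical helper, not part of this PR, and the dictionary is an in-memory copy of the dependency tables rather than something parsed from `pyproject.toml`.

```python
def find_unpinned(dependencies: dict[str, str]) -> list[str]:
    """Return the names of dependencies whose constraint is the wildcard "*"."""
    return sorted(
        name
        for name, constraint in dependencies.items()
        if constraint.strip() == "*"
    )


# Hypothetical in-memory copy of the dependency tables in plugins/pyproject.toml.
deps = {"python": "^3.9", "polyglot_piranha": "*", "pytest": "*"}
unpinned = find_unpinned(deps)
print(unpinned)
```

A check like this could run in CI so that a `*` constraint is caught before it reaches a release.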
37 changes: 37 additions & 0 deletions plugins/scala_test/README.md
@@ -0,0 +1,37 @@
# `scalatest` Migration Plugin (WIP)

This Piranha plugin migrates a Scala codebase to a newer version of `scalatest`.


Currently, it updates to [v.3.2.2](https://mvnrepository.com/artifact/org.scalatest/scalatest_2.12/3.2.2) only. The following import statements are updated:
* `org.scalatest.Matchers` -> `org.scalatest.matchers.should.Matchers`
* `org.scalatest.mock.MockitoSugar` -> `org.scalatestplus.mockito.MockitoSugar`
* `org.scalatest.FunSuite` -> `org.scalatest.funsuite.AnyFunSuite`
* `org.scalatest.junit.JUnitRunner` -> `org.scalatestplus.junit.JUnitRunner`
* `org.scalatest.FlatSpec` -> `org.scalatest.flatspec.AnyFlatSpec`
* `org.scalatest.junit.AssertionsForJUnit` -> `org.scalatestplus.junit.AssertionsForJUnit`

Contributor

This needs a description/explanation of what this is, before the Usage instructions.

Would also be a good point to note if this is a WIP or already functional and for which cases.

Contributor Author

Done.

## Usage:

Clone the repository - `git clone https://github.com/uber/piranha.git`

Install the dependencies - `pip3 install -r plugins/scala_test/requirements.txt`

Run the tool - `python3 plugins/scala_test/main.py -h`

CLI:
```
usage: main.py [-h] --path_to_codebase PATH_TO_CODEBASE

Updates the codebase to use a new version of `scalatest_2.12`.

options:
  -h, --help            show this help message and exit
  --path_to_codebase PATH_TO_CODEBASE
                        Path to the codebase directory.
```

## Test
```
pytest plugins/scala_test
```
25 changes: 25 additions & 0 deletions plugins/scala_test/main.py
@@ -0,0 +1,25 @@
import argparse
from update_imports import update_imports

def _parse_args():
    parser = argparse.ArgumentParser(description="Updates the codebase to use a new version of `scalatest_2.12`")
    parser.add_argument(
        "--path_to_codebase",
        required=True,
        help="Path to the codebase directory.",
    )
    parser.add_argument(
        "--new_version",
        default="3.2.2",
        help="Version of `scalatest` to update to.",
    )
    args = parser.parse_args()
    return args

def main():
    args = _parse_args()
    # dry_run=True: the changes are computed but not written to disk.
    update_imports(args.path_to_codebase, args.new_version, dry_run=True)

if __name__ == "__main__":
    main()
110 changes: 110 additions & 0 deletions plugins/scala_test/recipes.py
@@ -0,0 +1,110 @@
from polyglot_piranha import (
    Rule,
    OutgoingEdges,
    RuleGraph,
    PiranhaArguments,
    execute_piranha,
)


def replace_imports(
    target_new_types: dict[str, str],
    search_heuristic: str,
    path_to_codebase: str,
    dry_run=False,
):
    """This function replaces the imports of the target types with the new types.
    The search heuristic is used to find the files that contain the target types.

    Args:
        target_new_types (dict[str, str]): A dictionary from target type to new type (fully qualified names)
        search_heuristic (str): The search heuristic to find the files that contain the target types
        path_to_codebase (str): The path to the codebase
        dry_run (bool, optional): True if the changes should not be written to disk. Defaults to False.

    Returns:
        list: A list of PiranhaOutput objects
    """
    find_relevant_files = Rule(
        name="find_relevant_files",
        query='((identifier) @x (#eq? @x "@search_heuristic"))',
        holes={"search_heuristic"},
    )
    find_relevant_files_andThen_update_import = OutgoingEdges(
        "find_relevant_files", to=["update_import"], scope="File"
    )

    rules = [find_relevant_files]
    edges = [find_relevant_files_andThen_update_import]

    for target_type, new_type in target_new_types.items():
        rs, es = replace_import_rules_and_edges(target_type, new_type)
        rules.extend(rs)
        edges.extend(es)

    rule_graph = RuleGraph(rules=rules, edges=edges)

    args = PiranhaArguments(
        language="scala",
        path_to_codebase=path_to_codebase,
        rule_graph=rule_graph,
        substitutions={"search_heuristic": search_heuristic},
        dry_run=dry_run,
    )

    return execute_piranha(args)


def replace_import_rules_and_edges(
    target_qualified_type_name: str, new_qualified_type_name: str
) -> tuple[list[Rule], list[OutgoingEdges]]:
    """This function generates the rules and edges to replace the imports of the target type with the new type.
    It supports both simple and nested imports. While the simple imports are replaced directly, the nested imports are deleted and the new type is imported (as a simple non-nested import).
    Assume that the target type is "a.b.c.d" and the new type is "x.y.z". Then the following rules are generated:
    import a.b.c.d -> import x.y.z
    import a.b.c.{d, e} -> import x.y.z \n import a.b.c.{d}
Contributor

Suggested change
- import a.b.c.{d, e} -> import x.y.z \n import a.b.c.{d}
+ import a.b.c.{d, e} -> import x.y.z \n import a.b.c.{e}

I believe you meant e here, since d is the one getting replaced...

"""
    name_components = target_qualified_type_name.split(".")
    type_name = name_components[-1]

    qualifier_predicate = "\n".join(
        [f'(#match? @import_decl "{n}")' for n in name_components[:-1]]
    )

    delete_nested_import = Rule(
        name=f"delete_nested_import_{type_name}",
        query=f"""(
            (import_declaration (namespace_selectors (_) @tn )) @import_decl
            (#eq? @tn "{type_name}")
            {qualifier_predicate}
        )""",
        replace_node="tn",
        replace="",
        is_seed_rule=False,
        groups={"update_import"},
    )

    update_simple_import = Rule(
        name=f"update_simple_import_{type_name}",
        query=f"cs import {target_qualified_type_name}",
        replace_node="*",
        replace=f"import {new_qualified_type_name}",
        is_seed_rule=False,
        groups={"update_import"},
    )

    insert_import = Rule(
        name=f"insert_import_{type_name}",
        query="(import_declaration) @import_decl",
        replace_node="import_decl",
        replace=f"@import_decl\nimport {new_qualified_type_name}\n",
        is_seed_rule=False,
    )

    e2 = OutgoingEdges(
        f"delete_nested_import_{type_name}",
        to=[f"insert_import_{type_name}"],
        scope="Parent",
    )

    return [delete_nested_import, update_simple_import, insert_import], [e2]
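The rewrite described in the `replace_import_rules_and_edges` docstring can be illustrated without Piranha. The sketch below is a plain-string approximation, assuming one import per line; `rewrite_import` is a hypothetical helper, and the plugin itself matches on the syntax tree rather than with regular expressions.

```python
import re


def rewrite_import(line: str, target: str, new: str) -> list[str]:
    """Rewrite one Scala import line, mimicking the intended rule behaviour."""
    package, _, type_name = target.rpartition(".")
    stripped = line.strip()
    if stripped == f"import {target}":
        # Simple import: replace it directly.
        return [f"import {new}"]
    nested = re.fullmatch(rf"import {re.escape(package)}\.\{{(.*)\}}", stripped)
    if nested:
        selectors = [s.strip() for s in nested.group(1).split(",")]
        if type_name in selectors:
            # Nested import: drop the target type, then add a simple import.
            remaining = [s for s in selectors if s != type_name]
            kept = [f"import {package}.{{{', '.join(remaining)}}}"] if remaining else []
            return kept + [f"import {new}"]
    # Anything else is left untouched.
    return [line]


print(rewrite_import(
    "import org.scalatest.{BeforeAndAfter, Matchers}",
    "org.scalatest.Matchers",
    "org.scalatest.matchers.should.Matchers",
))
```

For the nested case this yields the two lines found in `tests/resources/expected/sample.scala`: the original import without `Matchers`, followed by a simple import of the new type.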
2 changes: 2 additions & 0 deletions plugins/scala_test/requirements.txt
@@ -0,0 +1,2 @@
polyglot-piranha
pytest
Empty file.
8 changes: 8 additions & 0 deletions plugins/scala_test/tests/resources/expected/sample.scala
@@ -0,0 +1,8 @@
package com.scala.piranha

import com.uber.michelangelo.AbstractSparkSuite
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}
import org.scalatest.{BeforeAndAfter}
import org.scalatest.matchers.should.Matchers
import org.scalatestplus.mockito.MockitoSugar
7 changes: 7 additions & 0 deletions plugins/scala_test/tests/resources/input/sample.scala
@@ -0,0 +1,7 @@
package com.scala.piranha

import com.uber.michelangelo.AbstractSparkSuite
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}
import org.scalatest.{BeforeAndAfter, Matchers}
import org.scalatest.mock.MockitoSugar
Contributor

Shouldn't we have a case where an import like import pkg1.pkg2.{A, B} is replaced as import pkg1.pkg2.{C, B} and also the simple import pkg3.pkg4.D -> import pkg5.pkg6.E case?

Contributor Author

Hmm. Actually, the solution you suggest looks clean when the before and after types have significant overlap in their qualified names. Otherwise, we have to "infer" the level at which to split the type name.

From: a.b.c.D to a.e.f.g.H.
Before: import a.b.c.{D, E}
After: import a.{b.c.E, e.f.g.H}
Alternatively, we could add some extra logic to heuristically decide when a nested import is not a good idea, and add a simple import in those cases.

From: a.b.c.D to a.b.c.H.
Before: import a.b.c.{D, E}
After: import a.b.c.{H, E}


(Actually, @raviagarwal7 suggested adding a simple import to keep the rewrite logic less complicated. We believe we can use Scala linters to reorganize the imports the way we want.)

Contributor

Not sure I understand this. I am not saying the rewrite done in these tests is incorrect, but I think there are missing cases given the rewrite rules you added. Specifically the update_simple_import_{...} rules. My question here is about test coverage for the rules/logic added.
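The heuristic discussed earlier in this thread (deciding when a nested import can be edited in place) might look roughly like the sketch below. Both function names are hypothetical and nothing here is part of the PR, which adds a simple import in every case.

```python
def shared_package_depth(old: str, new: str) -> int:
    """Count the leading package components two qualified names have in common."""
    old_pkg, new_pkg = old.split(".")[:-1], new.split(".")[:-1]
    depth = 0
    for a, b in zip(old_pkg, new_pkg):
        if a != b:
            break
        depth += 1
    return depth


def can_rename_in_place(old: str, new: str) -> bool:
    """True when both types live in the same package, so {D, E} can become {H, E}."""
    return old.rpartition(".")[0] == new.rpartition(".")[0]


# The two examples from the thread.
print(can_rename_in_place("a.b.c.D", "a.b.c.H"))
print(shared_package_depth("a.b.c.D", "a.e.f.g.H"))
```

When the packages differ, the shared depth indicates where a nested import would have to be split, which is the inference step the author wanted to avoid.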

37 changes: 37 additions & 0 deletions plugins/scala_test/tests/test_update_imports.py
@@ -0,0 +1,37 @@
from logging import error
from pathlib import Path
from os.path import join, basename
from os import listdir
from update_imports import update_imports

def test_update_imports():
    summary = update_imports("plugins/scala_test/tests/resources/input/", "3.2.2", dry_run=True)
    assert is_as_expected("plugins/scala_test/tests/resources/", summary)

def is_as_expected(path_to_scenario, output_summary):
Contributor

Wonder if there is a clean way to avoid the duplication between this code and the top-level test harness logic. Maybe a shared test utilities library? Not a big deal, but if every plugin has its own copy of this code, that might be a pain when you need to update something.

Contributor Author

I agree, I wanted to do that. I will eventually extract a commons module.
This could be moved to commons/test_utilities.
I believe that replace_imports could also be part of commons/scala.

    expected_output = join(path_to_scenario, "expected")
    print("Summary", output_summary)
    input_dir = join(path_to_scenario, "input")
    for file_name in listdir(expected_output):
        with open(join(expected_output, file_name), "r") as f:
            file_content = f.read()
        expected_content = "".join(file_content.split())

        # Search for the file in the output summary
        updated_content = [
            "".join(o.content.split())
            for o in output_summary
            if basename(o.path) == file_name
        ]
        print(file_name)
        # Check if the file was rewritten
        if updated_content:
            if expected_content != updated_content[0]:
                error("----update" + updated_content[0])
                return False
        else:
            # The scenario where the file is not expected to be rewritten
            original_content = Path(join(input_dir, file_name)).read_text()
            if expected_content != "".join(original_content.split()):
                return False
    return True
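The shared test utility proposed in the review thread could take roughly the following shape. This is a sketch under assumptions: the module location (`commons/test_utilities`) comes from the author's comment, `FakeSummary` is a hypothetical stand-in for Piranha's output objects, and only their `path` and `content` attributes are relied on.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FakeSummary:
    """Hypothetical stand-in for a Piranha output summary."""
    path: str
    content: str


def normalize(text: str) -> str:
    """Drop all whitespace so comparisons ignore formatting differences."""
    return "".join(text.split())


def is_as_expected(path_to_scenario: str, output_summary) -> bool:
    """Compare each file under expected/ with its rewrite, or with input/ if untouched."""
    scenario = Path(path_to_scenario)
    rewritten = {Path(o.path).name: normalize(o.content) for o in output_summary}
    for expected_file in (scenario / "expected").iterdir():
        expected = normalize(expected_file.read_text())
        actual = rewritten.get(expected_file.name)
        if actual is None:
            # File was not rewritten, so the input must already match.
            actual = normalize((scenario / "input" / expected_file.name).read_text())
        if actual != expected:
            return False
    return True
```

Each plugin's test would then import `is_as_expected` instead of carrying its own copy, so a change to the comparison logic happens in one place.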
17 changes: 17 additions & 0 deletions plugins/scala_test/update_imports.py
@@ -0,0 +1,17 @@
from recipes import replace_imports


IMPORT_MAPPING = {
    "org.scalatest.Matchers": "org.scalatest.matchers.should.Matchers",
    "org.scalatest.mock.MockitoSugar": "org.scalatestplus.mockito.MockitoSugar",
    # TODO: write test scenarios for these
    "org.scalatest.FunSuite": "org.scalatest.funsuite.AnyFunSuite",
    "org.scalatest.junit.JUnitRunner": "org.scalatestplus.junit.JUnitRunner",
    "org.scalatest.FlatSpec": "org.scalatest.flatspec.AnyFlatSpec",
    "org.scalatest.junit.AssertionsForJUnit": "org.scalatestplus.junit.AssertionsForJUnit",
}

def update_imports(path_to_codebase: str, scalatest_version, dry_run=False):
    if scalatest_version == "3.2.2":
        return replace_imports(IMPORT_MAPPING, "scalatest", path_to_codebase, dry_run)
    raise ValueError(f"Unsupported version: {scalatest_version}")