Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable Target/Address cycle detection in v2 #4358

Closed
peiyuwang opened this issue Mar 21, 2017 · 1 comment · Fixed by #5605
Closed

Re-enable Target/Address cycle detection in v2 #4358

peiyuwang opened this issue Mar 21, 2017 · 1 comment · Fixed by #5605

Comments

@peiyuwang
Copy link
Contributor

While #4349 improves performance, cycle detection is effectively lost due to the new SelectTransitive does not keep track of links, we could either add the cycle detection back within ST that would require reconstructing the graph and redo the work already implemented in Graph, or figure out sending the link info to Graph and reuse the existing cycle detection.

peiyuwang added a commit that referenced this issue Mar 22, 2017
…esses within SelectTransitive (#4349)

### Problem

As described in #4283, the issue with recursively executing `SelectTransitive` is lots of calls to `store_list` creating massive intermediate lists that increases memory footprint as well as impacts perf.

### Solution

Same idea as in #4265 using `loop_fn` rewrite `SelectTransitive` to execute iteratively. One small difference is in order to use `OrderMap` and `HashSet` for state tracking and outputs are in `Value`s, those `Values` are turned into `Key`s.

ps: previous PR #4334 paved way for this optimization by splitting `SelectTransitive` from `SelectDependencies(...transitive=true)`.

### Result

All existing tests pass. Have verified there is only one `ST` node through viz.

Two follow up items
* #4358 reenable cycle detection
* #3925 Remove BFA now not only is a clean up task, but has performance benefits because the second round hydration is on `Address` which is redundant.
@peiyuwang
Copy link
Contributor Author

Note master is BuildGraphIntegrationTest .test_cycle broken now because #4349 was only tested in v1, and cc9563c that was merged at the same time turned v2 on.

@peiyuwang peiyuwang self-assigned this Mar 22, 2017
peiyuwang added a commit that referenced this issue Mar 22, 2017
### Problem

We know #4349 is going to miss cycle detection, we already created #4358 to follow up. The issue is we didn't realize we have an integration test for it, so #4349 was checked in around the same time #4340, the former was only tested in v1, and the latter turned v2 on. Now master is broken.

### Solution

Skip the failing test

### Result

Will follow up with #4358 next for the proper fix.
lenucksi pushed a commit to lenucksi/pants that referenced this issue Apr 25, 2017
…esses within SelectTransitive (pantsbuild#4349)

### Problem

As described in pantsbuild#4283, the issue with recursively executing `SelectTransitive` is lots of calls to `store_list` creating massive intermediate lists that increases memory footprint as well as impacts perf.

### Solution

Same idea as in pantsbuild#4265 using `loop_fn` rewrite `SelectTransitive` to execute iteratively. One small difference is in order to use `OrderMap` and `HashSet` for state tracking and outputs are in `Value`s, those `Values` are turned into `Key`s.

ps: previous PR pantsbuild#4334 paved way for this optimization by splitting `SelectTransitive` from `SelectDependencies(...transitive=true)`.

### Result

All existing tests pass. Have verified there is only one `ST` node through viz.

Two follow up items
* pantsbuild#4358 reenable cycle detection
* pantsbuild#3925 Remove BFA now not only is a clean up task, but has performance benefits because the second round hydration is on `Address` which is redundant.
lenucksi pushed a commit to lenucksi/pants that referenced this issue Apr 25, 2017
### Problem

We know pantsbuild#4349 is going to miss cycle detection, we already created pantsbuild#4358 to follow up. The issue is we didn't realize we have an integration test for it, so pantsbuild#4349 was checked in around the same time pantsbuild#4340, the former was only tested in v1, and the latter turned v2 on. Now master is broken.

### Solution

Skip the failing test

### Result

Will follow up with pantsbuild#4358 next for the proper fix.
@stuhood stuhood changed the title Re-enable cycle detection in v2 Re-enable Target/Address cycle detection in v2 Oct 4, 2017
stuhood pushed a commit that referenced this issue Mar 19, 2018
### Problem

While chasing performance in one area for #4349 we added `SelectTransitive`. This decreased performance in another significant area: executions involving multiple transitive roots can't structure-share at all when expanding dependencies, leading to many redundant walks (#4533). 

Additionally, usability regressed in a few ways: the product `Graph` could not implement cycle detection for dependencies expanded via `SelectTransitive` (#4358), and in the case of a missing or broken dependency, the referring context was lost (#4515). 

### Solution

* Remove usage of `SelectTransitive` for `TransitiveHydratedTargets` to reintroduce structure sharing and improve usability (fixes #4358 and fixes #4515).
* Replace `@rule`s that operate on single-`Spec`s with batch forms that operate on a `Specs` collection (fixes #4533).

### Result

Cycles should be detected much earlier with:
```
Exception message: Build graph construction failed: ExecutionError Received unexpected Throw state(s):
Computing Select(Collection(dependencies=(SingleAddress(directory=u'testprojects/src/java/org/pantsbuild/testproject/cycle1', name=u'cycle1'),)), =TransitiveHydratedTargets)
  Computing Task(transitive_hydrated_targets, Collection(dependencies=(SingleAddress(directory=u'testprojects/src/java/org/pantsbuild/testproject/cycle1', name=u'cycle1'),)), =TransitiveHydratedTargets)
    Computing Task(transitive_hydrated_target, testprojects/src/java/org/pantsbuild/testproject/cycle1:cycle1, =TransitiveHydratedTarget)
      Computing Task(transitive_hydrated_target, testprojects/src/java/org/pantsbuild/testproject/cycle2:cycle2, =TransitiveHydratedTarget)
        Throw(No source of required dependency: Dep graph contained a cycle.)
          Traceback (no traceback):
            <pants native internals>
          Exception: No source of required dependency: Dep graph contained a cycle.
```
_(more polish needed here: re-opened #3695 to clean up `trace`)_

And time taken is proportional to the total number of matched targets, rather than to the sum of the closure sizes of the roots:
```
# before:
$ time ./pants --target-spec-file=targets.txt list | wc -l
    1500

real	0m15.297s
user	0m14.603s
sys	0m1.625s

# after:
$ time ./pants --target-spec-file=targets.txt list | wc -l
    1500

real	0m5.989s
user	0m5.261s
sys	0m1.310s
```

Runtimes for things like `./pants list ::` are unaffected, although memory usage increases by about 5%, likely due to the branchier resulting `Graph`.
stuhood pushed a commit that referenced this issue Mar 20, 2018
While chasing performance in one area for #4349 we added `SelectTransitive`. This decreased performance in another significant area: executions involving multiple transitive roots can't structure-share at all when expanding dependencies, leading to many redundant walks (#4533).

Additionally, usability regressed in a few ways: the product `Graph` could not implement cycle detection for dependencies expanded via `SelectTransitive` (#4358), and in the case of a missing or broken dependency, the referring context was lost (#4515).

* Remove usage of `SelectTransitive` for `TransitiveHydratedTargets` to reintroduce structure sharing and improve usability (fixes #4358 and fixes #4515).
* Replace `@rule`s that operate on single-`Spec`s with batch forms that operate on a `Specs` collection (fixes #4533).

Cycles should be detected much earlier with:
```
Exception message: Build graph construction failed: ExecutionError Received unexpected Throw state(s):
Computing Select(Collection(dependencies=(SingleAddress(directory=u'testprojects/src/java/org/pantsbuild/testproject/cycle1', name=u'cycle1'),)), =TransitiveHydratedTargets)
  Computing Task(transitive_hydrated_targets, Collection(dependencies=(SingleAddress(directory=u'testprojects/src/java/org/pantsbuild/testproject/cycle1', name=u'cycle1'),)), =TransitiveHydratedTargets)
    Computing Task(transitive_hydrated_target, testprojects/src/java/org/pantsbuild/testproject/cycle1:cycle1, =TransitiveHydratedTarget)
      Computing Task(transitive_hydrated_target, testprojects/src/java/org/pantsbuild/testproject/cycle2:cycle2, =TransitiveHydratedTarget)
        Throw(No source of required dependency: Dep graph contained a cycle.)
          Traceback (no traceback):
            <pants native internals>
          Exception: No source of required dependency: Dep graph contained a cycle.
```
_(more polish needed here: re-opened #3695 to clean up `trace`)_

And time taken is proportional to the total number of matched targets, rather than to the sum of the closure sizes of the roots:
```
$ time ./pants --target-spec-file=targets.txt list | wc -l
    1500

real	0m15.297s
user	0m14.603s
sys	0m1.625s

$ time ./pants --target-spec-file=targets.txt list | wc -l
    1500

real	0m5.989s
user	0m5.261s
sys	0m1.310s
```

Runtimes for things like `./pants list ::` are unaffected, although memory usage increases by about 5%, likely due to the branchier resulting `Graph`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant