Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default to forcing before hashJoin on cascading 3 #1777

Conversation

johnynek
Copy link
Collaborator

@cwensel @rubanm @piyushnarang

This seems to fix the vast majority of the issues I see on cascading 3 being able to plan random graphs.

here is a log of an error I found (but could fix by forcing the hashjoin): https://www.dropbox.com/s/v835xo1w3c68sib/84327C.tgz?dl=0 (it is the large example in this PR).

I still see some errors very rarely like:

[info]   Cause: java.lang.IllegalStateException: too many captured primary elements
[info]   at cascading.flow.planner.iso.transformer.RemoveBranchGraphTransformer.transformGraphInPlaceUsing(RemoveBranchGraphTransformer.java:59)
[info]   at cascading.flow.planner.iso.transformer.RecursiveGraphTransformer.transformGraphInPlaceUsingSafe(RecursiveGraphTransformer.java:140)
[info]   at cascading.flow.planner.iso.transformer.RecursiveGraphTransformer.transform(RecursiveGraphTransformer.java:120)
[info]   at cascading.flow.planner.iso.transformer.RecursiveGraphTransformer.transform(RecursiveGraphTransformer.java:74)
[info]   at cascading.flow.planner.rule.RuleTransformer.performTransform(RuleTransformer.java:111)
[info]   at cascading.flow.planner.rule.RuleTransformer.transform(RuleTransformer.java:97)
[info]   at cascading.flow.planner.rule.RuleExec.performTransform(RuleExec.java:416)
[info]   at cascading.flow.planner.rule.RuleExec.performMutation(RuleExec.java:226)
[info]   at cascading.flow.planner.rule.RuleExec.executeRulePhase(RuleExec.java:178)
[info]   at cascading.flow.planner.rule.RuleExec.planPhases(RuleExec.java:125)
[info]   at cascading.flow.planner.rule.RuleExec.exec(RuleExec.java:86)
[info]   at cascading.flow.planner.rule.RuleSetExec.execPlannerFor(RuleSetExec.java:153)
[info]   at cascading.flow.planner.rule.RuleSetExec$3.call(RuleSetExec.java:336)
[info]   at cascading.flow.planner.rule.RuleSetExec$3.call(RuleSetExec.java:328)

My last ditch idea is if cascading throws, rewrite the hashjoin into a cogrouped. That or, we could add a Config option for users to do this when their jobs fail.

Hopefully @cwensel can find a bug and fix this, or clearly describe to us what kind of graphs trigger this so we can avoid them.

@johnynek
Copy link
Collaborator Author

cc @non

@johnynek
Copy link
Collaborator Author

@piyushnarang can you take a quick look at this?

The tests have usually passed for me so far. I think there are still cases where cascading throws. I added a work-around of a config to disable hashjoins in those cases. If we see some, we can commit the graph as a test graph and move forward. This is not ideal, and I hope Chris Wensel can update the docs on what kinds of plans are allowed so we can avoid this, or perhaps fix a bug in the planner.

@johnynek
Copy link
Collaborator Author

Since this is green, I'm just going to merge it into the oscar/merge-develop-cascading3 branch which is itself a PR (into yet another PR. We are getting pretty recursive here).

@johnynek johnynek merged commit 6faeec2 into oscar/merge-develop-cascading3 Jan 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant