-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hoist reduction loops #104
Conversation
- Faster cse (transform temporaries) - Faster hoister - Faster FindInstances
@@ -466,6 +466,14 @@ def is_const(self): | |||
return not self.rank or all(is_const_dim(r) for r in self.rank) | |||
|
|||
@property | |||
def is_number(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have a specific number type if you wish...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a list of changes for COFFEE in mind including a restructuring of base.py
, as well as the addition of new types (eg Number
) and class attributes (eg is_Symbol
, is_Incr
). I'll do this in another PR
@@ -1209,6 +1222,26 @@ def gencode(self, not_scope=False): | |||
return self.children[0].gencode() | |||
|
|||
|
|||
class Rank(tuple): | |||
|
|||
def __contains__(self, val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you happy with the usage that sticks an COFFEE expression into rank
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm "kinda happy", in the sense that this Rank
class acts as a work around, although there's probably a cleaner solution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine, I do not mind having the interface cleanup (which will require corresponding changes in other components) in a separate PR after this one is merged.
self.symbols = symbols | ||
self.flatten = flatten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is odd, why do you assign a static function onto self
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
work around to circular imports, as I'm importing utils
and utils
is importing some visitors. Not sure where I should move flatten ? the other option is to re-import flatten in the member methods that use it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where I should move flatten ?
coffee.base
might work I think, although it feels a bit arbitrary. Or you could drop flatten
altogether and use chain-star instead:
flatten(xs)
=> list(chain(*xs))
or list(chain.from_iterable(xs))
.
If you directly use this in a for
, then you don't even need the list
. Of course, you will need from itertools import chain
.
the other option is to re-import flatten in the member methods that use it
I think that would express the intention (of breaking circular imports) more clearly than assigning to self
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now importing it directly in the relevant member, that is indeed better.
I didn't know about chain
; will probably use it all over the place in a next PR
@@ -1,6 +1,6 @@ | |||
[flake8] | |||
ignore = | |||
E501,F403,F405,E226,E265,E731,E402,E266, | |||
E501,F403,F405,E226,E265,E731,E402,E266,F999, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What fails without F999? On my machine flake8 is still happy without this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe because you have a different global setup? I get hundreds of warnings without F999
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a global setup? Are you sure your flake8 is up to date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flake8 --version 3.2.1 (pyflakes: 1.3.0, mccabe: 0.5.2, pycodestyle: 2.2.0) CPython 2.7.12 on Linux
I think with flake8 you can also have a global config file wherein you state all of the options that you want to be ignored in any project -- but maybe that's not your case
# Construct the sharing graph | ||
nodes, edges = [], [] | ||
for i in summands(self.stmt.rvalue): | ||
symbols = zip(*explore_operator(i))[0] if not isinstance(i, Symbol) else [i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you merge with master, you get Python 3 style zip
from six.moves
, which means this doesn't work. You need to write list(zip(...))[0]
instead of zip(...)[0]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, will fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I fixed most cases on master. This usage of zip, of course, works in Python 2 if you don't have from six.moves import zip
. So I suggest to fix these in the additions of this patch, and I also suggest to add the corresponding six.moves
import (filter, map, range, zip -- if used in the file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
foo(a) --> [] | ||
""" | ||
|
||
handle = zip(*explore_operator(node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested fix: cast iterable to list. The iterable is always "true" in a condition, whether it will yield any elements or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Was that way to avoid circular imports
61cc4f6
to
c1cdaae
Compare
When testing against master branches, I get failures in Firedrake tests/regression/test_conditional.py |
The conditional problem seems to be fixed, for example, with this patch: diff --git a/coffee/expander.py b/coffee/expander.py
index 9d41b40..9ece54c 100644
--- a/coffee/expander.py
+++ b/coffee/expander.py
@@ -73,7 +73,7 @@ class Expander(object):
return ([node], self.EXPAND) if self.should_expand(node) \
else ([node], self.GROUP)
- elif isinstance(node, (Div, FunCall)):
+ elif isinstance(node, (Div, Ternary, FunCall)):
# Try to expand /within/ the children, but then return saying "I'm not
# expandable any further"
for n in node.children:
diff --git a/coffee/hoister.py b/coffee/hoister.py
index 52147d4..5da6d9b 100644
--- a/coffee/hoister.py
+++ b/coffee/hoister.py
@@ -83,7 +83,7 @@ class Extractor(object):
elif isinstance(node, (FunCall, Ternary)):
arg_deps = [self._visit(n) for n in node.children]
- dep = tuple(set(flatten([dep for dep, _ in arg_deps])))
+ dep = set(flatten([dep for dep, _ in arg_deps]))
info = self.EXT if all(i == self.EXT for _, i in arg_deps) else self.STOP
return (dep, info)
diff --git a/coffee/rewriter.py b/coffee/rewriter.py
index 289b150..b20e8fa 100644
--- a/coffee/rewriter.py
+++ b/coffee/rewriter.py
@@ -409,7 +409,7 @@ class ExpressionRewriter(object):
if isinstance(node, (Symbol, Div)):
return
- elif isinstance(node, (Sum, Sub, FunCall)):
+ elif isinstance(node, (Sum, Sub, FunCall, Ternary)):
for n in node.children:
_reassociate(n, node)
|
There is another issue against master branches: ...
A[0][0] = A00[0][0];
A[0][1] = A01[0][0];
A[0][2] = A00[0][1];
A[0][3] = A01[0][1];
A[1][0] = A10[0][0];
A[1][1] = A11[0][0];
A[1][2] = A10[0][1];
A[1][3] = A11[0][1];
A[2][0] = A00[1][0];
A[2][1] = A01[1][0];
A[2][2] = A00[1][1];
A[2][3] = A01[1][1];
A[3][0] = A10[1][0];
A[3][1] = A11[1][0];
A[3][2] = A10[1][1];
A[3][3] = A11[1][1];
for (int j = 0; j < 2; j += 1)
{
for (int k = 0; k < 2; k += 1)
{
#pragma coffee expression
A00[j][k] += (t0[j] * ct5[k]) + (t1[j] * ct6[k]);
}
}
for (int j = 0; j < 2; j += 1)
{
for (int k = 0; k < 2; k += 1)
{
#pragma coffee expression
A11[j][k] += (t0[j] * ct12[k]) + (t1[j] * ct13[k]);
}
} Obviously, the |
I hadn't tested with firedrake master yet. The patch looks fine (the Tell me how to reproduce the other bug, which should be an easy fix |
firedrake tests/extrusion/test_interior_facets_extr.py (use master branches) |
These should now be fixed |
Does not actually fix #97, but I don't think it matters any more. We may still close that as wontfix because of firedrakeproject/tsfc@1d7ce21. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now OK to go.
I'm also happy to land this. Is that fine for you if I just hit merge, or do you prefer to go in synch with some other PRs? |
I'm fine with just merging, I was just wondering if @wence- has anything to say. |
This PR introduces support for hoisting reduction loops (eg, sum-factorisation in TSFC kernels), alongside a number of additional contributions:
Also:
fixes #76, fixes #98, closes #99, fixes #100, resolves #101