
Within-vector reductions #4628

Merged: 7 commits into master on Jun 25, 2020
Conversation

@abadams (Member) commented Feb 19, 2020

Various ISAs have vector instructions that perform a full or partial reduction across the vector lanes, and we have had no good way to reach these. Of particular note is the instruction variously referred to as udot/dp4a/vrmpy, which does a 4 x int8_t dot product into a 32-bit result, all potentially within a wider vector. This instruction is important for quantized neural networks.
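To make those semantics concrete, here is a scalar reference sketch of that operation (illustrative only, not code from this PR); the real instructions apply this per 32-bit lane of a wider vector:

    #include <cstdint>

    // 4 x int8_t dot product accumulated into a 32-bit result,
    // i.e. the per-lane semantics of udot/dp4a/vrmpy.
    int32_t dp4a_reference(const int8_t a[4], const int8_t b[4], int32_t acc) {
        for (int i = 0; i < 4; i++) {
            acc += int32_t(a[i]) * int32_t(b[i]);
        }
        return acc;
    }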

This PR reaches such ops by composing atomic() with vectorize(). These operations are viewed as atomic (i.e. the semantics are serial but unordered) reductions along the vectorized dimension.
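For example, a dot product can be scheduled like this (a minimal sketch; the input Funcs A and B and the length num_elems are assumed, and a near-identical pattern appears later in this thread):

    Func dot;
    RDom r(0, num_elems);
    dot() += cast<uint32_t>(A(r)) * B(r);
    // atomic() asserts the update can be reordered; vectorizing the
    // reduction variable then exposes a within-vector reduction.
    dot.update().atomic().vectorize(r, 16);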

I added a new IR node, VectorReduce, which reduces groups of adjacent lanes in a wide vector down to a narrower vector using some operator. This is equivalent to a tree of binary operations on slices of the vector, but that form is much harder to peephole-match in backends. The reduction operators are our primitive IR nodes that happen to be associative and commutative (add, mul, min, max, and, or). This node is introduced by the vectorize_loops pass when it would otherwise emit a pattern of the flavor f(ramp()/4) = f(ramp()/4) + some_vector, or, for total reductions, f(broadcast()) = f(broadcast()) + some_vector.
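As a scalar model of the node's semantics (an illustrative sketch assuming the Add operator, not Halide's actual implementation):

    #include <cstdint>
    #include <vector>

    // VectorReduce with op == Add: an input of output_lanes * factor
    // lanes is reduced to output_lanes lanes by summing each group of
    // `factor` adjacent lanes.
    std::vector<int32_t> vector_reduce_add(const std::vector<int32_t> &value,
                                           int output_lanes) {
        const int factor = (int)value.size() / output_lanes;
        std::vector<int32_t> out(output_lanes, 0);
        for (int i = 0; i < output_lanes; i++) {
            for (int j = 0; j < factor; j++) {
                out[i] += value[i * factor + j];
            }
        }
        return out;
    }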

Backends can specialize their handling of this node to hit the appropriate ops. It gets a little more complex than that because some of these ops can also incorporate an existing accumulator. Much of the new logic in this PR is in various backends. I added support for x86, arm, and cuda. I did not touch the hexagon backend, as I figured @pranavb-ca or @dsharletg would be better at handling that. For the arm udot/sdot instructions I added codegen that I believe is correct but again have no way to test it (our buildbots don't have these instructions, which need a qualcomm 855 or similar). I added a feature flag to guard the emission of these instructions. Maybe @pranavb-ca or @dsharletg can sanity-check with a quick test on an 855?

For d3d and opencl there are intrinsics for this that should reach the same underlying instruction (dp4a on nvidia cards), but I didn't want to drag this PR out further; those can be addressed later. Also related, but not considered here, are reductions along the warp via composing atomic() and gpu_lanes().

As a final note, atomic_vectorization is my favorite branch name to date.

@abadams (Member Author) commented Feb 19, 2020

FYI @alexreinking that cmake presubmit just caught me forgetting to add two tests to the cmakelist, so it was worthwhile.

@steven-johnson (Contributor):

Is this ready to review? (I'm assuming not, given the failed buildbots.)

@abadams (Member Author) commented Feb 20, 2020

Triaging the failed buildbots now. Looks like the main issue is an llvm bug that I'll need to work around: https://bugs.llvm.org/show_bug.cgi?id=44976

But there's also a crash on arm that I need to figure out. Feel free to wait until I get the buildbots green - this isn't urgent.

@abadams (Member Author) commented Feb 20, 2020

Craig Topper jumped on my bug report and so the llvm bug was just fixed in master. We just need to figure out what to do for earlier llvms. A workaround is probably still valuable if there's a simple one.

@steven-johnson (Contributor):

For now, you could just declare (and enforce) that this feature is only available in LLVM 11+

@abadams (Member Author) commented Feb 20, 2020

Might be hard to do, because this isn't a peephole optimization that's breaking. It happens for vanilla codegen of a VectorReduce IR node, and these could appear for arbitrary reasons. I think I'll just try padding out the vectors to the next power of two for llvms < 11.
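A scalar model of that padding idea (illustrative names and logic, assuming zero-padding is safe for multiplication; not the actual CodeGen_LLVM change):

    #include <cstdint>
    #include <vector>

    // Widening multiply of a non-power-of-two number of lanes, done by
    // zero-padding both operands up to the next power-of-two lane count
    // and discarding the padding lanes afterwards.
    std::vector<int32_t> widening_mul_padded(std::vector<int16_t> a,
                                             std::vector<int16_t> b) {
        const size_t n = a.size();        // e.g. 12 lanes
        size_t padded = 1;
        while (padded < n) padded <<= 1;  // round up to 16
        a.resize(padded, 0);
        b.resize(padded, 0);
        std::vector<int32_t> out(padded);
        for (size_t i = 0; i < padded; i++) {
            out[i] = int32_t(a[i]) * int32_t(b[i]);
        }
        out.resize(n);  // keep only the original lanes
        return out;
    }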

@abadams (Member Author) commented Feb 20, 2020

Actually, this bug triggers for plain widening multiplication of non-power-of-two vector sizes:

    ImageParam in1(Int(16), 1), in2(Int(16), 1);
    Func f;
    Var x;
    f(x) = cast<int32_t>(in1(x)) * in2(x);
    f.vectorize(x, 12);
    f.compile_jit(Target("x86-64-linux"));

Yikes. I guess nobody uses those. I'll have to inject a workaround into the Mul visitor

@abadams (Member Author) commented Feb 20, 2020

FYI @alinas for the just-fixed llvm bug, in case you encounter something similar. It goes away if you turn on the sse4.1 feature flag, so I doubt this ever triggers when building for real systems.

@alinas (Contributor) commented Feb 20, 2020

Got it, thanks for the heads-up.

@steven-johnson (Contributor):

Something is wonky with this PR -- it's been showing a travis test as in progress for days now. Maybe due to the unresolved conflicts?

@abadams (Member Author) commented Mar 4, 2020

I'll try a merge with master

@abadams (Member Author) commented Mar 14, 2020

This is now green and ready for review

@steven-johnson (Contributor):

(I know normally we don't bother rebasing commits, but in this case, maybe consider doing so -- it's spread over 80 or so commits.)

@abadams (Member Author) commented Mar 16, 2020

Sure. There are a few things I can probably carve off into separate PRs too

@@ -1472,6 +1472,14 @@ Value *CodeGen_LLVM::codegen(const Expr &e) {
value = nullptr;
e.accept(this);
internal_assert(value) << "Codegen of an expr did not produce an llvm value\n";

// Halide doesn't distinguish between scalars and vectors of size 1.
Contributor:

might be good to expand this comment: the code below is converting a vector of size 1 to a scalar?


#if LLVM_VERSION >= 90
if (output_lanes == 1 &&
// As of the release of llvm 10, the 64-bit experimental total
Contributor:

should these criteria be moved to "llvm_has_intrinsic"?

}; // namespace

class FindVectorizableExprsInAtomicNode : public IRMutator {
Scope<> poisoned_names;
Contributor:

I probably know what this means but might be good to comment about the definition of "poisoned".

@steven-johnson (Contributor):

> Sure. There are a few things I can probably carve off into separate PRs too

Yeah, with this amount of change, any easy carving will likely make our life simpler later (if we have to go back and debug injections)

@steven-johnson (Contributor) left a review:

(LGTM pending lots of nits)

Does the C++ backend need attention here? (Even if it's only injecting something that will deliberately fail?)

IMHO we're going to need some more detailed docs/examples (tutorial?) on using this as the details and limitations might not be immediately obvious.

}
break;
case VectorReduce::Mul:
interval = Interval::everything();
Contributor:

Nit: aren't there some cases we could exploit here? e.g., what if the type is float and we know the bounds are [0.0, 1.0]?

// lanes and double the bit-width.

int factor = op->value.type().lanes() / op->type.lanes();
if ((op->type.is_int() ||
Contributor:

style nit: this expr is really hard to read. Might be better to pull out as a separate expr, perhaps even like

bool has_reduce = is_int || is_uint || is_float;
if (!case1) has_reduce = false; // reason why
if (!case2) has_reduce = false; // reason why

etc

@@ -1100,6 +1256,10 @@ string CodeGen_ARM::mattrs() const {
arch_flags = "+sve";
}

if (target.has_feature(Target::ARMDotProd)) {
Contributor:

Does this need to be gated by LLVM_VERSION?

Member Author:

llvm 8 has it, so we're good


#if LLVM_VERSION >= 90
if (output_lanes == 1 &&
// As of the release of llvm 10, the 64-bit experimental total
Contributor:

Do they work in trunk?

Member Author:

No


if (factor > 2 && ((factor & 1) == 0)) {
// Factor the reduce into multiple stages. If we're going to
// be widen the type by 4x or more we should also factor the
Contributor:

widening

// Even if the load is bad, maybe we can lift the index
IRMutator::visit(op);

// TODO: tuples?
Contributor:

this TODO could probably use a bit more meat on the bone

Member Author:

Hrm, that's something I genuinely forgot to finish. Will need to add a test.

@@ -1022,6 +1431,7 @@ class VectorizeLoops : public IRMutator {
// Replace the var with a ramp within the body
Expr for_var = Variable::make(Int(32), for_loop->name);
Expr replacement = Ramp::make(for_loop->min, 1, extent->value);
stmt = for_loop->body;
Contributor:

...is this a pre-existing bug?

Member Author:

No, I think this additional line is a no-op, because the next line doesn't use stmt and just clobbers it. I'll delete it.

@@ -20,6 +20,8 @@ WEAK CpuFeatures halide_get_cpu_features() {
// features.set_available(halide_target_feature_armv7s);
// }

// TODO: add runtime detection for ARMDotProd extension
Contributor:

Open an issue on this

if (!t.has_feature(Target::CUDACapability61)) {
printf("Not running test because no cuda (with compute capability 6.1) is not enabled in the target\n");
return 0;
}
Contributor:

suggestion: print the target here

@@ -187,7 +187,7 @@ class SimdOpCheck : public SimdOpCheckTest {

// SSE 2

-for (int w = 2; w <= 4; w++) {
+for (int w : {2, 4}) {
Contributor:

I presume you intend to skip the w=3 case here

Member Author:

Yeah, LLVM does different things with different versions, and it's not a case worth testing anyway

@abadams (Member Author) commented Mar 16, 2020

Moving to a top-level comment: not finished until I address the "TODO: tuples?" in VectorizeLoops.cpp.

@joesavage (Contributor) commented Jun 2, 2020

Do we have a story in this pull request for hoisting the total reductions that typically follow these kinds of partial reduction operations out of the inner loop? Today, using this branch, I'm finding that Halide code like the following, which performs a dot product:

Func dot("dot");
RDom rv(0, num_elems);
dot() += cast(UInt(32), A_u8(rv)) * B_u8(rv);
dot.update().atomic().vectorize(rv, 16);

produces assembly like the following for AArch64, where the movi vector initialization and the addv horizontal reduction are in the innermost loop (since the VectorReduce target type is a scalar):

ldr	q0, [x12], #16
ldr	q1, [x10], #16
movi	v2.2d, #0x0
subs	x9, x9, #0x1
udot	v2.4s, v0.16b, v1.16b
addv	s0, v2.4s
fmov	w13, s0
add	w8, w13, w8
b.ne	374 <dot+0x374>

Presumably the same is also true when generating Hexagon's vrmpy and the like, producing code that doesn't make the most of the accumulating capabilities of these instructions. Is there some way to get Halide to continually accumulate into a single vector using this atomic().vectorize() paradigm, or is it expected that these capabilities make up a separate unit of work and thus would likely come from a separate pull request? The following sequence, which feels like it perhaps should work, currently hits an internal error ("Only Serial and Parallel For nodes should survive down to codegen"):

Func dot("dot");
Func partial("partial");
Var x("x"), y("y"), k("k");

// Define the function
const int vec = 16;
RDom rvi(0, vec, "rvi");
RDom rvo(0, num_elems/vec, "rvo");
partial(k) += cast(UInt(32), A_u8(vec * k + rvi)) * B_u8(vec * k + rvi);
dot() += partial(rvo);

// Schedule the function
partial.update().atomic().vectorize(rvi);
dot.update().atomic().vectorize(rvo, 4);

@abadams (Member Author) commented Jun 2, 2020

The scheduling directive for that is already in master: rfactor. It's a bit tricky to use unfortunately. Here's how you'd use it for a vectorized dot product:

// Define the function
Func dot("dot");
RDom r(0, num_elems);
dot() += cast(UInt(32), A_u8(r)) * B_u8(r);

RVar rvo, rvi;
Var k;
const int vec = 16;
dot.update().split(r, rvo, rvi, vec);
// In partial, each value of rvi gets a separate accumulator indexed by the new pure variable k.
Func partial = dot.update().rfactor(rvi, k);
partial.compute_root().vectorize(k).update().vectorize(k);

@joesavage (Contributor):

Thanks for the pointer towards rfactor. Unfortunately, though, it looks like this particular case might be a bit more complicated. If we simply use rfactor on a reduction domain of extent 16 as suggested, this implies that we want to compute 16 independent partial sums, producing a VectorReduce node with source type uint8x16 and target type uint32x16. Since this reads 16 elements from each input buffer and produces 16 outputs, it isn't really a VectorReduce in the true sense, and thus doesn't generate UDOT.

Ultimately, what we really want here is an rfactor of extent four – where each independent accumulation contains an inner loop of size four – and then to vectorize over these loops with a factor of 16 to produce a VectorReduce with source type uint8x16 and target type uint32x4. I had a stab at doing this with the following:

// Define the function
Func dot("dot");
RDom r(0, num_elems, "r");
dot() += cast(UInt(32), A(r)) * B(r);

// Schedule the function
Var k("k");
RVar rvo("rvo"), rvoo("rvoo"), rvio("rvio"), rvii("rvii"), fused("fused");
dot.update().split(r, rvo, rvii, 4).split(rvo, rvoo, rvio, 4);
Func partial = dot.update().rfactor(rvio, k);
partial.compute_root().vectorize(k);
partial.update().fuse(rvii, k, fused).atomic().vectorize(fused);

Unfortunately, though, while this does produce a UDOT instruction, it seems to be doing some really weird stuff such that it definitely isn't generating the desired result. I need to look into the 'why' in more detail, but I'm seeing lots of weird if statements in the pseudocode for the inner loop as a result of the vectorization on fused for whatever reason.

Happy to move this discussion into a separate issue if that's helpful as I don't want to drown out any actual review comments for the pull request, but I figure that getting this use case nailed down is probably quite important for this patch and I'm curious to see if there's a good answer here.

@abadams (Member Author) commented Jun 3, 2020

Current state of the branch is a mess, because I discovered a case where atomic nodes were being incorrectly stripped and had to add them back everywhere temporarily, resulting in CAS loops. I temporarily hacked them back off to look at this case.

Everything works out cleanly if you promise num_elems is a multiple of 16. When it's not you get tail cases. One problem is that both of those splits are GuardWithIf, because they're splits of an RVar with unknown extent. You can make the inner one a RoundUp by reassociating the splits:


    Var k("k");
    RVar rvo("rvo"), rvi("rvi"), rvio("rvio"), rvii("rvii"), fused("fused");
    dot.update().split(r, rvo, rvi, 16).split(rvi, rvio, rvii, 4);
    Func partial = dot.update().rfactor(rvio, k);
    partial.compute_root().vectorize(k);
    partial.update().fuse(rvii, k, fused).atomic().vectorize(fused);

That gives you the hot loop:

.LBB0_2:                                // %"for dot_intm.s1.r$x.rvo"
                                        // =>This Inner Loop Header: Depth=1
	ldr	q1, [x10], #16
	ldr	q2, [x11], #16
	subs	x9, x9, #1              // =1
	udot	v0.4s, v1.16b, v2.16b
	b.ne	.LBB0_2

If num_elems is not known to be a multiple of 16, you get tail cases and if statements, and the hot loop becomes:

.LBB0_2:                                // %true_bb
                                        //   in Loop: Header=BB0_4 Depth=1
	lsl	x2, x15, #4
	sub	x3, x2, x11
	sub	x2, x2, x13
	ldr	q0, [sp]
	ldr	q1, [x8, x3]
	ldr	q2, [x10, x2]
	udot	v0.4s, v1.16b, v2.16b
	str	q0, [sp]
.LBB0_3:                                // %after_bb
                                        //   in Loop: Header=BB0_4 Depth=1
	add	x15, x15, #1            // =1
	cmp	x15, x16
	b.eq	.LBB0_37
.LBB0_4:                                // %"for dot_intm.s1.r$x.rvo"
	sxtw	x2, w15
	lsl	x3, x2, #4
	add	w2, w3, #16             // =16
	cmp	w2, w12
	b.le	.LBB0_2

Not sure why LLVM sees the need to spill the accumulator (q0). But in any case we shouldn't have this if statement. I'll see if I can get it to go away.

@abadams (Member Author) commented Jun 3, 2020

Found the problem. Will be fixed in the final version of the branch. We were generating if conditions of the form likely(likely(foo)), which messed up a bunch of stuff.

@abadams abadams force-pushed the atomic_vectorization branch from ad55e00 to 0cce2b5 Compare June 11, 2020 22:52
@abadams abadams force-pushed the atomic_vectorization branch from 54fbad6 to f63c800 Compare June 13, 2020 02:02
@abadams (Member Author) commented Jun 13, 2020

ok, ready for more review

const VectorReduce *e = expr.as<VectorReduce>();

compare_scalar(op->op, e->op);
// We've already compared types, so it's enough to compare the value
Member:

Is type comparison missing? There is a comment about it, but I only see operator type and value comparisons. Is it somehow implicit?

Member Author:

types get compared before this function is called in compare_expr

src/IRMatch.cpp (outdated)

void visit(const VectorReduce *op) override {
const VectorReduce *e = expr.as<VectorReduce>();
if (result && e && op->op == e->op) {
Member:

Similar question about the type here: don't we need to compare number of lanes as well?

Member Author:

I think we need to compare types here. Will fix.

@@ -463,6 +463,31 @@ Expr lossless_cast(Type t, Expr e) {
return Expr();
}
}

if (const VectorReduce *red = e.as<VectorReduce>()) {
Member:

Nit: 'red' is a confusing name.

return -1;
}

if (!checker.atomics) {
Member:

Should be if(checker.vector_reduces)

Expr rhs = cast(dst_type, in(x * reduce_factor + r));
Expr rhs2 = cast(dst_type, in(x * reduce_factor + r + 32));

if (op == 4 || op == 5) {
Member:

It would be much more readable to use named values for reduction operator types

Member Author:

These aren't reduction operator types, they're the test cases in the switch statement below. I'll add a comment.

const int factor = red->value.type().lanes() / red->type.lanes();
switch (red->op) {
case VectorReduce::Add:
if (t.bits() >= 16 && factor < (1 << (t.bits() / 2))) {
Member:

I think a short comment would be really helpful here.

@@ -413,6 +423,10 @@ bool equal_helper(const BaseExprNode &a, const BaseExprNode &b) noexcept {
case IRNodeType::Shuffle:
return (equal_helper(((const Shuffle &)a).vectors, ((const Shuffle &)b).vectors) &&
equal_helper(((const Shuffle &)a).indices, ((const Shuffle &)b).indices));
case IRNodeType::VectorReduce:
return (((const VectorReduce &)a).op == ((const VectorReduce &)b).op &&
Member:

Same here.

Member Author:

Hrm, at this point we already know the types match (it's checked outside equal_helper), but we don't know that the types of the values match before recursively calling equal_helper, so I'd better check that.

Member Author:

I think the cast handler has the same bug

};

// A ramp with the lanes repeated (e.g. <0 0 2 2 4 4 6 6>)
struct InterleavedRamp {
Member:

This looks like a ramp(broadcast(a, repetitions), broadcast(b, repetitions), lanes). Could you please add a TODO so I can update it after merging?

Member Author:

Done

@@ -941,6 +1029,171 @@ class VectorSubs : public IRMutator {
return Allocate::make(op->name, op->type, op->memory_type, new_extents, op->condition, body, new_expr, op->free_function);
}

Stmt visit(const Atomic *op) override {
// Recognize a few special cases that we can handle as within-vector reduction trees.
do {
Member:

Always wanted to say that goto is more readable :)

Member Author:

I considered it, but having an explicit scope with breaks makes me more confident that RAII things are going to have the lifetime I expect

@@ -489,11 +490,17 @@ void ComputeModulusRemainder::visit(const Let *op) {
}

void ComputeModulusRemainder::visit(const Shuffle *op) {
-// It's possible that scalar expressions are extracting a lane of a vector - don't fail in this case, but stop
+// It's possible that scalar expressions are extracting a lane of
+// a vector - don't faiql in this case, but stop
Member:

Fail is still misspelled.

@abadams abadams merged commit 95e8982 into master Jun 25, 2020
@abadams (Member Author) commented Jun 25, 2020

woo!

@abadams abadams deleted the atomic_vectorization branch July 8, 2020 20:44