
[SPARK-23690][ML] Add handleinvalid to VectorAssembler #20829

Closed
wants to merge 25 commits into from

Conversation

yogeshg
Contributor

@yogeshg yogeshg commented Mar 15, 2018

What changes were proposed in this pull request?

Introduce a handleInvalid parameter in VectorAssembler that accepts the options "keep", "skip", and "error". "error" throws an exception on seeing a row containing a null, "skip" filters out all such rows, and "keep" pads the output with the appropriate number of NaN values. For "keep", the transformer inspects an example row to determine how many NaNs should be added, and throws an error when that number cannot be determined.

How was this patch tested?

Unit tests check the behavior of assemble on specific rows, and the transformer is exercised on DataFrames of various configurations to cover corner cases.
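The three modes can be illustrated with a small, plain-Python sketch (a hypothetical helper with made-up names, not the actual Spark code, which operates on DataFrames):

```python
import math

def handle_rows(rows, mode):
    """Sketch of the three handleInvalid modes; each row is a list of
    floats or None (standing in for a SQL null)."""
    if mode == "skip":
        # drop any row containing a null
        return [r for r in rows if all(v is not None for v in r)]
    if mode == "keep":
        # replace each null with NaN
        return [[math.nan if v is None else v for v in r] for r in rows]
    if mode == "error":
        if any(v is None for r in rows for v in r):
            raise ValueError("null value seen with handleInvalid = 'error'")
        return rows
    raise ValueError("unsupported mode: " + mode)
```

For example, `handle_rows([[1.0, None], [2.0, 3.0]], "skip")` keeps only the second row, while `"keep"` turns the `None` into a NaN.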

@yogeshg yogeshg changed the title [SPARK-23690] [ML] Add handleinvalid to VectorAssembler [SPARK-23690][ML] Add handleinvalid to VectorAssembler Mar 15, 2018
@SparkQA

SparkQA commented Mar 15, 2018

Test build #88249 has finished for PR 20829 at commit c0c0e3d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -234,7 +234,7 @@ class StringIndexerModel (
val metadata = NominalAttribute.defaultAttr
.withName($(outputCol)).withValues(filteredLabels).toMetadata()
// If we are skipping invalid records, filter them out.
val (filteredDataset, keepInvalid) = getHandleInvalid match {
Contributor

Why need change this line ?

Contributor Author

Thanks for picking this out! I changed this because I was matching on $(handleInvalid) in VectorAssembler and that seems to be the recommended way of doing this. Should I include this in the current PR and add a note or open a separate PR?

Contributor

OK, it doesn't matter; no need for a separate PR, I think. Just a minor change.

Member

For the record, in general, I would not bother making changes like this. The one exception I do make is IntelliJ style complaints since those can be annoying for developers.

@SparkQA

SparkQA commented Mar 15, 2018

Test build #88277 has finished for PR 20829 at commit bf2f5b3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yogeshg
Contributor Author

yogeshg commented Mar 15, 2018

I fixed the code paths that failed tests; waiting for @SparkQA. An offline talk with @MrBago suggests that we can perhaps decrease the number of maps in the transform method. Looking into that.

@SparkQA

SparkQA commented Mar 15, 2018

Test build #88278 has finished for PR 20829 at commit 5ce7671.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yogeshg
Contributor Author

yogeshg commented Mar 16, 2018

test this please

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88286 has finished for PR 20829 at commit 482225f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88287 has finished for PR 20829 at commit 482225f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

*/
@Since("1.6.0")
override val handleInvalid: Param[String] = new Param[String](this, "handleInvalid",
"Hhow to handle invalid data (NULL values). Options are 'skip' (filter out rows with " +
Contributor

Hhow -> How

override val handleInvalid: Param[String] = new Param[String](this, "handleInvalid",
"Hhow to handle invalid data (NULL values). Options are 'skip' (filter out rows with " +
"invalid data), 'error' (throw an error), or 'keep' (return relevant number of NaN " +
"in the * output).", ParamValidators.inArray(VectorAssembler.supportedHandleInvalids))
Contributor

"in the * output" -> "in the output"

@@ -49,32 +51,65 @@ class VectorAssembler @Since("1.4.0") (@Since("1.4.0") override val uid: String)
@Since("1.4.0")
def setOutputCol(value: String): this.type = set(outputCol, value)

/** @group setParam */
@Since("1.6.0")
Contributor

@Since("2.4.0")

|VectorAssembler cannot determine the size of empty vectors. Consider applying
|VectorSizeHint to ${c} so that this transformer can be used to transform empty
|columns.
""".stripMargin.replaceAll("\n", " "))
Contributor

I think in this case VectorSizeHint also cannot help to provide the vector size.

val lengths = featureAttributesMap.map(a => a.length)
val metadata = new AttributeGroup($(outputCol), featureAttributes.toArray).toMetadata()
val (filteredDataset, keepInvalid) = $(handleInvalid) match {
case StringIndexer.SKIP_INVALID => (dataset.na.drop("any", $(inputCols)), false)
Contributor

you can directly use dataset.na.drop($(inputCols))

Contributor Author

Ah, good point! I do think that keeping "any" might make it easier to read, though that may not hold for experienced people :P

""".stripMargin.replaceAll("\n", " "))
}
if (isMissingNumAttrs) {
val column = dataset.select(c).na.drop()
Contributor

  • The var name column isn't good. colDataset is better.

  • An optional optimization is one-pass scanning the dataset and count non-null rows for each "missing num attrs" columns.

Contributor Author

Good catch! That name was bothering me too :P
@MrBago and I are thinking of another way to do this more efficiently.

@ghost

ghost commented Mar 16, 2018 via email

@SparkQA

SparkQA commented Mar 20, 2018

Test build #88390 has finished for PR 20829 at commit 2b1fd4e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 20, 2018

Test build #88392 has finished for PR 20829 at commit 9624061.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 20, 2018

Test build #88395 has finished for PR 20829 at commit ab91545.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

@hootoconnor Please refrain from making non-constructive comments. If you did not intend to leave the comment here, please remove it. Thanks.

@jkbradley jkbradley left a comment

Thanks for the PR! I made a pass, mainly looking at tests.


@Since("1.6.0")
override def load(path: String): VectorAssembler = super.load(path)

private[feature] def assemble(vv: Any*): Vector = {
private[feature] def assemble(lengths: Seq[Int], keepInvalid: Boolean)(vv: Any*): Vector = {
Member
nit: Use Array[Int] for faster access

Member

Also, I'd add doc explaining requirements, especially that this assumes that lengths and vv have the same length.

}
}

test("assemble should compress vectors") {
import org.apache.spark.ml.feature.VectorAssembler.assemble
val v1 = assemble(0.0, 0.0, 0.0, Vectors.dense(4.0))
val v1 = assemble(Seq(1, 1, 1, 4), true)(0.0, 0.0, 0.0, Vectors.dense(4.0))
Member

We probably want this to fail, right? It expects a Vector of length 4 but is given a Vector of length 1.

Contributor Author

That's a typo, thanks for pointing it out! That number is not used when there are no nulls, which is why the test passes.
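To make the signature under discussion concrete, here is a plain-Python sketch (hypothetical names, not the real implementation, which assembles a compressed ML Vector) of how assemble(lengths, keepInvalid)(vv) consults the declared lengths only when padding nulls:

```python
import math

def assemble(lengths, keep_invalid, values):
    """lengths[i] declares how many output slots values[i] occupies; it is
    consulted only when a value is null and must be padded with NaNs."""
    assert len(lengths) == len(values), "lengths and values must align"
    out = []
    for n, v in zip(lengths, values):
        if v is None:
            if not keep_invalid:
                raise ValueError("null value in row (handleInvalid = 'error')")
            out.extend([math.nan] * n)  # pad with the declared number of NaNs
        elif isinstance(v, list):       # list stands in for a Vector
            out.extend(v)
        else:
            out.append(float(v))
    return out
```

This also shows why the typo above went unnoticed: when a row has no nulls, the declared lengths are never consulted, so a wrong length cannot make the test fail.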

def setHandleInvalid(value: String): this.type = set(handleInvalid, value)

/**
* Param for how to handle invalid data (NULL values). Options are 'skip' (filter out rows with
Member

It would be good to expand this doc to explain the behavior: how various types of invalid values are treated (null, NaN, incorrect Vector length) and how computationally expensive different options can be.

Contributor Author

The behavior of the options is already included, an explanation of column lengths is included here, and runtime information is included in the VectorAssembler class documentation. Thanks for the suggestion, this is super important!

Contributor Author

Also, we deal only with nulls here; NaNs and incorrect-length vectors are passed through transparently. Do we need to test for those?

Member

I'd recommend we deal with NaNs now. This PR is already dealing with some NaN cases: Dataset.na.drop handles NaNs in NumericType columns (but not VectorUDT columns).

I'm Ok with postponing incorrect vector lengths until later or doing that now since that work will be more separate.

@@ -147,4 +149,72 @@ class VectorAssemblerSuite
.filter(vectorUDF($"features") > 1)
.count() == 1)
}

test("assemble should keep nulls") {
Member

make more explicit: + " when keepInvalid = true"

Vectors.dense(Double.NaN, Double.NaN))
}

test("get lengths function") {
Member

This is great that you're testing this carefully, but I recommend we make sure to pass better exceptions to users. E.g., they won't know what to do with a NullPointerException, so we could instead tell them something like: "Column x in the first row of the dataset has a null entry, but VectorAssembler expected a non-null entry. This can be fixed by explicitly specifying the expected size using VectorSizeHint."

Contributor Author

Thanks! We do throw a descriptive error here; I added more detail to it and made the tests assert on those messages.

}

test("Handle Invalid should behave properly") {
val df = Seq[(Long, Long, java.lang.Double, Vector, String, Vector, Long)](

Member

Also, if there are "trash" columns not used by VectorAssembler, maybe name them as such and add a few null values in them for better testing.

Contributor Author

Thanks, good idea! This helped me catch the na.drop() bug that might drop everything.


// behavior when first row has information
assert(assembler.setHandleInvalid("skip").transform(df).count() == 1)
intercept[RuntimeException](assembler.setHandleInvalid("keep").transform(df).collect())
Member

Should this fail? I thought it should pad with NaNs.

Contributor Author

It fails because the vector size hint is not given; adding a section with VectorSizeHints.

intercept[RuntimeException](assembler.setHandleInvalid("keep").transform(df).collect())
intercept[SparkException](assembler.setHandleInvalid("error").transform(df).collect())

// numeric column is all null
Member

Did you want to test:

  • extraction of metadata from the first row (which is what this is testing, I believe), or
  • transformation on an all-null column (which this never reaches)?

Contributor Author

I was testing extraction of metadata for the numeric column (whose length is always 1). Not relevant in the new framework.

intercept[RuntimeException](
assembler.setHandleInvalid("keep").transform(df.filter("id1==3")).count() == 1)

// vector column is all null
Member

ditto

Yogesh Garg added 2 commits March 21, 2018 18:05
…and style

review wip

adding an all null column should not break anything; bugfix

review wip

update test logic
@SparkQA

SparkQA commented Mar 22, 2018

Test build #88494 has finished for PR 20829 at commit e7e26f0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 22, 2018

Test build #88495 has finished for PR 20829 at commit 4c99003.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley jkbradley left a comment

Thanks for the updates! I mostly have style comments at this point.

import org.apache.spark.ml.param.shared._
import org.apache.spark.ml.util._
import org.apache.spark.sql.{DataFrame, Dataset, Row}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

/**
* A feature transformer that merges multiple columns into a vector column.
* A feature transformer that merges multiple columns into a vector column. This requires one pass
Member

style nit: Move new text here into a new paragraph below. That will give nicer "pyramid-style" formatting with essential info separated from details.


lazy val first = dataset.toDF.first()
val attrs = $(inputCols).flatMap { c =>

val vectorCols = $(inputCols).toSeq.filter { c =>
Member

nit: Is toSeq extraneous?

}
val vectorColsLengths = VectorAssembler.getLengths(dataset, vectorCols, $(handleInvalid))

val featureAttributesMap = $(inputCols).toSeq.map { c =>
Member

I think the flatMap is simpler, or at least a more common pattern in Spark and Scala (rather than having nested sequences which are then flattened).

Contributor Author

We need the map to find the lengths of the vectors; unless there is a way to do this in a single mapping, I think this is better than calling a map and then a flatMap.

if (group.attributes.isDefined) {
// If attributes are defined, copy them with updated names.
group.attributes.get.zipWithIndex.map { case (attr, i) =>
val attributeGroup = AttributeGroup.fromStructField(field)
Member

for the future, I'd avoid renaming things like this unless it's really unclear or needed (to make diffs shorter)

@Since("1.6.0")
override def load(path: String): VectorAssembler = super.load(path)

private[feature] def assemble(vv: Any*): Vector = {
/**
* Returns a UDF that has the required information to assemble each row.
Member

nit: When people say "UDF," they generally mean a Spark SQL UDF. This is just a function, not a SQL UDF.

val df = Seq(
(0, 0.0, Vectors.dense(1.0, 2.0), "a", Vectors.sparse(2, Array(1), Array(3.0)), 10L)
).toDF("id", "x", "y", "name", "z", "n")
val df = dfWithNulls.filter("id1 == 1").withColumn("id", col("id1"))
Member

nit: If this is for consolidation, I'm actually against this little change since it obscures what this test is doing and moves the input Row farther from the expected output row.

.setInputCols(Array("x", "y", "z", "n"))
.setOutputCol("features")

def run_with_metadata(mode: String, additional_filter: String = "true"): Dataset[_] = {
Member

style: use camelCase


def run_with_metadata(mode: String, additional_filter: String = "true"): Dataset[_] = {
val attributeY = new AttributeGroup("y", 2)
val subAttributesOfZ = Array(NumericAttribute.defaultAttr, NumericAttribute.defaultAttr)
Member

unused

output.collect()
output
}
def run_with_first_row(mode: String): Dataset[_] = {
Member

style: Put empty line between functions

@jkbradley jkbradley left a comment

The changes look good, but there are a few unaddressed comments.

c -> AttributeGroup.fromStructField(dataset.schema(c)).size
}.toMap
val missing_columns: Seq[String] = group_sizes.filter(_._2 == -1).keys.toSeq
val first_sizes: Map[String, Int] = (missing_columns.nonEmpty, handleInvalid) match {
Member

ping

case (true, VectorAssembler.SKIP_INVALID) =>
getVectorLengthsFromFirstRow(dataset.na.drop(missing_columns), missing_columns)
case (true, VectorAssembler.KEEP_INVALID) => throw new RuntimeException(
s"""Can not infer column lengths for 'keep invalid' mode. Consider using
Member

ping
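The length-inference step being reviewed here can be sketched in plain Python (hypothetical names; the real code reads sizes from column metadata and falls back to the dataset's first row):

```python
def infer_lengths(metadata_sizes, first_row):
    """metadata_sizes maps column -> declared vector size, with -1 meaning
    unknown; first_row maps column -> first value (a list) or None."""
    sizes = {}
    for col, size in metadata_sizes.items():
        if size >= 0:
            sizes[col] = size                 # size known from metadata
        elif first_row.get(col) is not None:
            sizes[col] = len(first_row[col])  # infer from the first row
        else:
            raise RuntimeError("Can not infer length of column " + col +
                               "; consider using VectorSizeHint")
    return sizes
```

An all-null column cannot be sized from data, mirroring the error the PR raises for the 'keep' mode.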

@SparkQA

SparkQA commented Apr 2, 2018

Test build #88829 has finished for PR 20829 at commit 081b5c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 2, 2018

Test build #88834 has finished for PR 20829 at commit 134bd1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 2, 2018

Test build #88835 has finished for PR 20829 at commit bf277be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

LGTM
Merging with master
Thanks @yogeshg for the PR and @WeichenXu123 for taking a look!

@asfgit asfgit closed this in a135182 Apr 2, 2018
robert3005 pushed a commit to palantir/spark that referenced this pull request Apr 4, 2018
## What changes were proposed in this pull request?

Introduce a `handleInvalid` parameter in `VectorAssembler` that accepts the options `"keep"`, `"skip"`, and `"error"`. "error" throws an exception on seeing a row containing a `null`, "skip" filters out all such rows, and "keep" pads the output with the appropriate number of NaN values. For "keep", the transformer inspects an example row to determine how many NaNs should be added, and throws an error when that number cannot be determined.

## How was this patch tested?

Unit tests are added to check the behavior of `assemble` on specific rows and the transformer is called on `DataFrame`s of different configurations to test different corner cases.

Author: Yogesh Garg <yogesh(dot)garg()databricks(dot)com>
Author: Bago Amirbekian <[email protected]>
Author: Yogesh Garg <[email protected]>

Closes apache#20829 from yogeshg/rformula_handleinvalid.
mshtelma pushed a commit to mshtelma/spark that referenced this pull request Apr 5, 2018