-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Address main scalastyle errors - #196 #248
Conversation
Fix some magic numbers & duplicate string runs.
Codecov Report
@@ Coverage Diff @@
## master #248 +/- ##
==========================================
+ Coverage 70.57% 70.94% +0.37%
==========================================
Files 41 41
Lines 982 1005 +23
Branches 179 180 +1
==========================================
+ Hits 693 713 +20
- Misses 232 235 +3
Partials 57 57
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work.
Couple fixes need to happen, as well as some questions answered.
throw new IllegalArgumentException() | ||
case other => throw other | ||
case other: Any => throw other | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this blank line?
import org.apache.spark.SparkContext | ||
import org.apache.spark.rdd.RDD | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this blank line?
@@ -45,7 +47,6 @@ object ExtractPopularImages { | |||
.reduceByKey((image1, image2) => (image1._1, image1._2, image1._3 + image2._3)) | |||
.map(x=> (x._2._3, x._2._2)) | |||
.takeOrdered(limit)(Ordering[Int].on(x => -x._1)) | |||
res.foreach(x => println(x._1 + "\t" + x._2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this being removed? Is it just printing to a stdout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - it was just printing.
Scalastyle rejects all println.
I can restore it and use a // scalastyle:off
if it's necessary, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if @ianmilligan1 can speak to usefulness of a popular images printout here if we already have the RDD of images available (i.e. they can be mapped through and printed out in scripts if needed)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just ran it on both master and your branch, @greebie, and yes - probably that println
was in there for debugging. The saveAsTextFile
output is identical to the printed lines, so we were getting them twice. I think this is a good alteration.
@@ -26,7 +26,7 @@ import java.util | |||
import scala.collection.mutable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to change the name of this file as well, if you're changing it to NERClassifier
. Make sure you do a git mv foo bar
, otherwise you'll lose the history on the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix. @ianmilligan1 I think this should be tested in case I broke something. There are no working unit tests for this I do not think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested and works well (updated the docs which were outdated, actually).
@@ -22,7 +22,7 @@ import java.nio.charset.StandardCharsets | |||
import org.apache.commons.codec.binary.Hex | |||
|
|||
|
|||
/** Information about an image. e.g. width, height*/ | |||
/** Information about an image. e.g. width, height */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put a full stop after height.
@@ -26,6 +28,7 @@ object TweetUtils { | |||
* @param tweet JValue/JSON object containing Twitter API data (JSON) | |||
*/ | |||
implicit class JsonTweet(tweet: JValue) { | |||
val userBranch = "user" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you speak to why userBranch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
userBranch
refers to user as a branch of tweet json. The main error that wants to be fixed here is multiple occurrences of "user." Next commit will just use user
as I agree that's less confusing.
@@ -29,10 +29,10 @@ class ExtractBoilerPipeTextTest extends FunSuite { | |||
<footer>Copyright 2017</footer>""" | |||
var boiler = """Copyright 2017""" | |||
|
|||
test("Collects boilerpip") { | |||
test("Collects boilerpiep") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
Ok, let me test NER and look into images - should be able to do so tomorrow! |
GitHub issue(s):
#196 , also see #212 in discussion below
What does this Pull Request do?
Provides main fixes for scalastyle where the changes do not require major code refactoring.
sparksql.functions._
How should this be tested?
Mostly
mvn scalastyle:check
but it should also pass Travis and tests.Additional Notes:
I have ignored any scalastyle error where I would have to refactor code to make it work.
"Avoid null" errors were the most common. I have created an issue in #212 about how to resolve these (if we indeed wish to). Looking at the code, I think it's possible that resolving these could help us with a few null Pointer errors. But I'd like to address them in a different PR, because it involves different approaches to some conditionals and may require documentation changes.
Interested parties
@ruebot