[SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate building and testing on Windows (currently SparkR only) #14859
Conversation
cc @rxin

Hm, we also had Travis config that isn't used now, to try to add Java style checking. I can see the value in adding Windows testing, but here we have a third CI tool involved. I'm concerned that I, for example, wouldn't know how to maintain it. I suppose we also need to decide if Windows is even supported? I don't think it is supported for development, certainly. For deployment, best effort is my understanding, but it may not work. If this relies on a bunch of setup to run (including needing a sort of unofficial copy of Hadoop's winutils), then does testing it this way say much about how it works on Windows?

Ah, I thought Windows was already officially supported, judging from this documentation: https://github.com/apache/spark/blob/master/docs/index.md#downloading. BTW, I do understand your concerns, but I believe this will make it easy to review Windows-specific tests. I mean, at least, we can identify Windows-specific problems easily and, as you already know, I believe it is hard to review PRs for Windows-specific problems currently. I wouldn't mind if this were closed because, at least, I have proposed an automated build on Windows here, so reviewers can use it after manually (locally) merging this anyway. My personal opinion, though, is to try to use this, as it does not affect the code-base or other builds.

Good point, I suppose there is a weak promise there that it runs on Windows.

Thanks @HyukjinKwon - I for one would at least like the SparkR client to work on Windows, as I think there are quite a few R users who are on Windows. @HyukjinKwon Could you add some documentation on how this build works, where we can check its status, and what we can do to restart a build, etc.?

It looks like a few other Apache projects use AppVeyor, and there's a designated process for asking INFRA to enable it for a GitHub mirror: https://ci.appveyor.com/project/ApacheSoftwareFoundation/thrift/history For example, Thrift uses it: https://ci.appveyor.com/project/ApacheSoftwareFoundation/thrift One concern of mine is whether the large number of Spark pull requests and commits would overwhelm Apache's AppVeyor account limits if every one of those builds were to be tested with AppVeyor. Another minor concern relates to GitHub build statuses: if AppVeyor reports its statuses via the GitHub commit status API but AMPLab Jenkins continues to report only via comments, then the build status indicators on the pull request page would look confusing. It would be cool if there were a way to have this enabled by default for the master branch but have it be opt-in for pull requests.

Good point on only running this on the master branch. We could even run it periodically (say, nightly) instead of on every commit.
Pop-Location

# ========================== R
$rVer = "3.3.0"
try 3.3.1?
Test build #64556 has finished for PR 14859 at commit
Thanks all. Then, let me try to write some documentation for:

- How to set up
- How to re-build/stop/check the builds
- How this build works: identically with Travis CI, per-commit, with options to enable/disable the build for branches, PRs and commits
BTW, there is a getting started guide here: https://www.appveyor.com/docs
To cut it short, my suggestion is,
If it sounds good, I will go ahead and test. But before proceeding, it'd be great if I could hear other opinions. I think it'd be okay just to filter commits via message. WDYT? - @srowen @JoshRosen @shivaram @felixcheung (I am on holiday until tomorrow, so I would like to test this environment quickly.)
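As a rough sketch of the branch and commit-message filtering discussed above (key names should be double-checked against the AppVeyor docs linked earlier; the message pattern here is a hypothetical placeholder), an appveyor.yml fragment might look like:

```yaml
# Only build the master branch, as suggested above
branches:
  only:
    - master

# Optionally, only build commits whose message matches a pattern,
# so per-PR noise does not exhaust the AppVeyor account limits
only_commits:
  message: /AppVeyor/
```

This would keep regular pull-request builds on Jenkins while letting AppVeyor cover Windows on master (or on explicitly tagged commits).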
Test build #64922 has finished for PR 14859 at commit
@HyukjinKwon Given that the build is green after #14960, is this good to go? Any other comments @dongjoon-hyun @srowen?

Hi, @shivaram and @HyukjinKwon. LGTM.

Thank you! I also think it's good to go.
- cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"

build_script:
- cmd: mvn -DskipTests -Psparkr -Phive -Phive-thriftserver package
Given that we are fixing the Hadoop version that we download in install-dependencies.ps1, does it make sense to also pass in the same profile here, i.e. -Phadoop-2.6?
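If that suggestion were taken, the build_script entry would presumably become something like this (a sketch only, not the final committed config):

```yaml
build_script:
  # Pin the Hadoop profile to match the Hadoop version downloaded
  # by dev\appveyor-install-dependencies.ps1
  - cmd: mvn -DskipTests -Psparkr -Phive -Phive-thriftserver -Phadoop-2.6 package
```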
LGTM pending the inline comment about hadoop-2.6

Thanks. FYI, I re-ran the test here - https://ci.appveyor.com/project/HyukjinKwon/spark/build/87-SPARK-17200-build-profile

Thanks. I'll merge this once Jenkins passes

+@steveloughran since this is running his tool
# Install maven and dependencies
- ps: .\dev\appveyor-install-dependencies.ps1
# Required package for R unit tests
- cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
Might want to include e1071, survival for a few more compatibility tests (see DESCRIPTION file).
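Incorporating that suggestion, the install line could be extended along these lines (a sketch; the package set comes from the comment above and is not verified against the DESCRIPTION file):

```yaml
# Required package for R unit tests (testthat), plus e1071 and
# survival for a few more compatibility tests
- cmd: R -e "install.packages(c('testthat', 'e1071', 'survival'), repos='http://cran.us.r-project.org')"
```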
we have been broken with newer versions of testthat before, not sure if we should fix the version we run with here to match Jenkins?
Ah, thanks! I will.
That's a good point, but it's actually tricky to specify a version number using install.packages. FWIW, on Jenkins I see:
> packageVersion("testthat")
[1] ‘1.0.2’
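One possible workaround, sketched here and not part of this PR, would be remotes::install_version, assuming the remotes package can be installed on the image:

```yaml
# Pin testthat to the version used on Jenkins (1.0.2).
# remotes is installed first, since install.packages alone
# cannot pin a package version.
- cmd: R -e "install.packages('remotes', repos='http://cran.us.r-project.org'); remotes::install_version('testthat', version='1.0.2', repos='http://cran.us.r-project.org')"
```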
True - maybe just print out packageVersion into the log, in case it breaks.
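Logging the version could be as simple as one extra install step (sketch):

```yaml
# Log the installed testthat version so version-related breakages
# are easy to diagnose from the build output
- cmd: R -e "packageVersion('testthat')"
```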
Test build #65072 has finished for PR 14859 at commit
Test build #65073 has finished for PR 14859 at commit
I just added both. I re-ran the test here - https://ci.appveyor.com/project/HyukjinKwon/spark/build/88-SPARK-17200-build-profile
BTW, I already cc'd the author of winutils in #14859 (comment) :).
Test build #65083 has finished for PR 14859 at commit

retest this please

Test build #65095 has finished for PR 14859 at commit
Alright, I'm merging this into master. Let's see how this goes in the next few days, and we can fine-tune this with some follow-up PRs if necessary.
What changes were proposed in this pull request?
This PR adds build automation on Windows with the AppVeyor CI tool.
Currently, this only runs the tests for SparkR, as we have been having some issues with testing Windows-specific PRs (e.g. #14743 and #13165) and a hard time verifying them.
One concern is that this build depends on steveloughran/winutils (maintained by a Hadoop PMC member) for the pre-built Hadoop bin package.
How was this patch tested?
Manually, https://ci.appveyor.com/project/HyukjinKwon/spark/build/88-SPARK-17200-build-profile
This takes roughly 40 mins.
Some tests are already failing; this was found in #14743 (comment) and fixed in SPARK-17339.