-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-14204] [SQL] register driverClass rather than user-specified class #12000
Conversation
That seems like a pretty reasonable change, although it might be good to add a regression test. Maybe @davies or @JoshRosen can verify this patch for Jenkins? |
Thanks @holdenk I agree about the value of a regression test for this. I'm not familiar with how Spark's regression testing works. Is there a mechanism through which to run tests that emulate cluster-mode spark, with isolated master/executor JVMs, as well as isolated primordial and task JVMs for the executor? I think it may be necessary to have all 3 JVMs isolated in order to test this issue. |
There are some integration tests hosted in an separate repo by the Databricks people at https://github.com/databricks/spark-integration-tests , I'm not super sure on the policy for contributing to that but maybe one the Databricks people can chime in (or we could just leave doing that as a follow up activity). |
Bump. Would be nice to get closure on this. I doubt that we were alone in being affected by this (although admittedly, complaints of other failures seem not to have yet surfaced). |
@JoshRosen do you have a thought on this since you made the change? |
@mchalek @JoshRosen Is there any progress yet? I got the same error with spark-1.6.1. But if I add "driver=com.mysql.jdbc.Driver" property to write.jdbc's properties params, i was ok. |
I'm on vacation right now. This patch seems pretty small and reasonable to me, though, so it would be great if someone else could take a final look and merge if appropriate. |
@JoshRosen , could you merge this into branch-1.6? |
ping @JoshRosen |
@srowen @JoshRosen @holdenk could you take a look and merge this pr? |
Jenkins test this please |
Test build #59804 has finished for PR 12000 at commit
|
Jenkins retest this please |
Test build #59806 has finished for PR 12000 at commit
|
This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an `IllegalStateException`. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204 My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user. This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed. Author: Kevin McHale <[email protected]> Closes #12000 from mchalek/jdbc-driver-registration.
Merged to 1.6 per Josh |
This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an `IllegalStateException`. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204 My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user. This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed. Author: Kevin McHale <[email protected]> Closes apache#12000 from mchalek/jdbc-driver-registration. (cherry picked from commit 0a13e4c)
Can one of the admins verify this patch? |
@srowen @JoshRosen does this pr need to be merged into master or branch-2.0 ? |
I had the impression this was specific to the 1.x code, and not an issue in 2.x? |
The problem is with commit 7f37c1e, so if that commit is on master then this needs to be merged into master as well. Taking a quick look at the affected file, it does appear that the incorrect code is in master. |
Ah, the hash for the relevant commit on master is 6c83d93 |
Hi, @srowen @JoshRosen @mchalek , this issue still exists on branch-2.0/master, if I add "driver=com.mysql.jdbc.Driver" property to write.jdbc's properties params, it was ok, otherwise failed. |
OK open a new PR for master |
thanks, @mchalek |
This is a pull request that was originally merged against branch-1.6 as #12000, now being merged into master as well. srowen zzcclp JoshRosen This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an IllegalStateException. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204 My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user. This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed. Author: Kevin McHale <[email protected]> Closes #14420 from mchalek/mchalek-jdbc_driver_registration. (cherry picked from commit 685b08e) Signed-off-by: Sean Owen <[email protected]>
PS @mchalek can you manually close this PR? it's merged but can't be auto closed |
This is a pull request that was originally merged against branch-1.6 as #12000, now being merged into master as well. srowen zzcclp JoshRosen This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an IllegalStateException. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204 My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user. This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed. Author: Kevin McHale <[email protected]> Closes #14420 from mchalek/mchalek-jdbc_driver_registration.
This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an
IllegalStateException
. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.
This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.