Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-14204] [SQL] register driverClass rather than user-specified class #12000

Closed
wants to merge 1 commit into from

Conversation

mchalek
Copy link
Contributor

@mchalek mchalek commented Mar 28, 2016

This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an IllegalStateException. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204

My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.

This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.

@mchalek mchalek changed the title [SPARK-14204] [SQL] register driverClass rather than user class [SPARK-14204] [SQL] register driverClass rather than user-specified class Mar 28, 2016
@holdenk
Copy link
Contributor

holdenk commented Mar 29, 2016

That seems like a pretty reasonable change, although it might be good to add a regression test. Maybe @davies or @JoshRosen can verify this patch for Jenkins?

@mchalek
Copy link
Contributor Author

mchalek commented Mar 29, 2016

Thanks @holdenk I agree about the value of a regression test for this. I'm not familiar with how Spark's regression testing works. Is there a mechanism through which to run tests that emulate cluster-mode spark, with isolated master/executor JVMs, as well as isolated primordial and task JVMs for the executor? I think it may be necessary to have all 3 JVMs isolated in order to test this issue.

@holdenk
Copy link
Contributor

holdenk commented Mar 29, 2016

There are some integration tests hosted in an separate repo by the Databricks people at https://github.com/databricks/spark-integration-tests , I'm not super sure on the policy for contributing to that but maybe one the Databricks people can chime in (or we could just leave doing that as a follow up activity).

@mchalek
Copy link
Contributor Author

mchalek commented Apr 5, 2016

Bump. Would be nice to get closure on this. I doubt that we were alone in being affected by this (although admittedly, complaints of other failures seem not to have yet surfaced).

@srowen
Copy link
Member

srowen commented May 4, 2016

@JoshRosen do you have a thought on this since you made the change?

@zzcclp
Copy link
Contributor

zzcclp commented May 17, 2016

@mchalek @JoshRosen Is there any progress yet? I got the same error with spark-1.6.1. But if I add "driver=com.mysql.jdbc.Driver" property to write.jdbc's properties params, i was ok.
This pr looks like a pretty reasonable change without setting "driver=com.mysql.jdbc.Driver" property.

@JoshRosen
Copy link
Contributor

I'm on vacation right now. This patch seems pretty small and reasonable to me, though, so it would be great if someone else could take a final look and merge if appropriate.

@zzcclp
Copy link
Contributor

zzcclp commented May 26, 2016

@JoshRosen , could you merge this into branch-1.6?

@zzcclp
Copy link
Contributor

zzcclp commented May 30, 2016

ping @JoshRosen

@zzcclp
Copy link
Contributor

zzcclp commented Jun 1, 2016

@srowen @JoshRosen @holdenk could you take a look and merge this pr?

@srowen
Copy link
Member

srowen commented Jun 2, 2016

Jenkins test this please

@SparkQA
Copy link

SparkQA commented Jun 2, 2016

Test build #59804 has finished for PR 12000 at commit a927c76.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 2, 2016

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Jun 2, 2016

Test build #59806 has finished for PR 12000 at commit a927c76.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jun 2, 2016
This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user).  The consequence of this is that all JDBC accesses under the described circumstances fail with an `IllegalStateException`. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204

My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.

This patch was tested manually.  I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.

Author: Kevin McHale <[email protected]>

Closes #12000 from mchalek/jdbc-driver-registration.
@srowen
Copy link
Member

srowen commented Jun 2, 2016

Merged to 1.6 per Josh

@mchalek
Copy link
Contributor Author

mchalek commented Jun 2, 2016

Cool, thanks for pushing on this @zzcclp and for merging it @srowen. Any idea when 1.6.2 will be released?

zzcclp pushed a commit to zzcclp/spark that referenced this pull request Jun 3, 2016
This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user).  The consequence of this is that all JDBC accesses under the described circumstances fail with an `IllegalStateException`. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204

My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.

This patch was tested manually.  I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.

Author: Kevin McHale <[email protected]>

Closes apache#12000 from mchalek/jdbc-driver-registration.

(cherry picked from commit 0a13e4c)
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@zzcclp
Copy link
Contributor

zzcclp commented Jul 27, 2016

@srowen @JoshRosen does this pr need to be merged into master or branch-2.0 ?

@srowen
Copy link
Member

srowen commented Jul 27, 2016

I had the impression this was specific to the 1.x code, and not an issue in 2.x?

@mchalek
Copy link
Contributor Author

mchalek commented Jul 27, 2016

The problem is with commit 7f37c1e, so if that commit is on master then this needs to be merged into master as well. Taking a quick look at the affected file, it does appear that the incorrect code is in master.

@mchalek
Copy link
Contributor Author

mchalek commented Jul 27, 2016

Ah, the hash for the relevant commit on master is 6c83d93

@zzcclp
Copy link
Contributor

zzcclp commented Jul 29, 2016

Hi, @srowen @JoshRosen @mchalek , this issue still exists on branch-2.0/master, if I add "driver=com.mysql.jdbc.Driver" property to write.jdbc's properties params, it was ok, otherwise failed.

@srowen
Copy link
Member

srowen commented Jul 29, 2016

OK open a new PR for master

@zzcclp
Copy link
Contributor

zzcclp commented Jul 30, 2016

@srowen @mchalek could you open a new pr for master/branch-2.0?

@zzcclp
Copy link
Contributor

zzcclp commented Jul 30, 2016

thanks, @mchalek

asfgit pushed a commit that referenced this pull request Aug 3, 2016
This is a pull request that was originally merged against branch-1.6 as #12000, now being merged into master as well.  srowen zzcclp JoshRosen

This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an IllegalStateException. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204

My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.

This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.

Author: Kevin McHale <[email protected]>

Closes #14420 from mchalek/mchalek-jdbc_driver_registration.

(cherry picked from commit 685b08e)
Signed-off-by: Sean Owen <[email protected]>
@srowen
Copy link
Member

srowen commented Aug 3, 2016

PS @mchalek can you manually close this PR? it's merged but can't be auto closed

asfgit pushed a commit that referenced this pull request Aug 3, 2016
This is a pull request that was originally merged against branch-1.6 as #12000, now being merged into master as well.  srowen zzcclp JoshRosen

This pull request fixes an issue in which cluster-mode executors fail to properly register a JDBC driver when the driver is provided in a jar by the user, but the driver class name is derived from a JDBC URL (rather than specified by the user). The consequence of this is that all JDBC accesses under the described circumstances fail with an IllegalStateException. I reported the issue here: https://issues.apache.org/jira/browse/SPARK-14204

My proposed solution is to have the executors register the JDBC driver class under all circumstances, not only when the driver is specified by the user.

This patch was tested manually. I built an assembly jar, deployed it to a cluster, and confirmed that the problem was fixed.

Author: Kevin McHale <[email protected]>

Closes #14420 from mchalek/mchalek-jdbc_driver_registration.
@asfgit asfgit closed this in 53e766c Aug 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants