[SPARK-5100][SQL] add thriftserver-ui support #3946

Closed
wants to merge 7 commits into from

Conversation

Contributor

@tianyi tianyi commented Jan 8, 2015

In the latest Spark release, there is a Spark Streaming tab on the driver web UI, which shows information about the running streaming application. A similar monitoring page would be helpful for the Thrift server, because both streaming applications and the Thrift server are long-running, and their details do not appear on the stage page or the job page.

Design doc is here: https://issues.apache.org/jira/secure/attachment/12690744/Spark%20Thrift-server%20monitor%20page.pdf

Prototype snapshot: https://issues.apache.org/jira/secure/attachment/12690297/prototype-screenshot.png

@liancheng
Contributor

ok to test

@@ -93,6 +94,13 @@ private[hive] class HiveThriftServer2(hiveContext: HiveContext)
  extends HiveServer2
  with ReflectedCompositeService {

  private[hive] val uiTab: Option[ThriftServerTab] =
    if (hiveContext.hiveconf.getBoolean("spark.ui.enabled", true)) {
Contributor

It's a bit weird to read Spark configuration via HiveConf although it's valid here. hiveContext.sparkContext.getConf might be better.
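The suggestion above can be sketched as follows. This is a minimal illustration, assuming spark-core is on the classpath; the `UiEnabledCheck` object and `uiEnabled` helper are hypothetical names, not part of the patch. Inside the Thrift server the conf would come from `hiveContext.sparkContext.getConf`, as the comment proposes.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper, for illustration only.
object UiEnabledCheck {
  // Read the Spark UI flag from SparkConf rather than from HiveConf.
  // Defaults to true when the key is not set.
  def uiEnabled(conf: SparkConf): Boolean =
    conf.getBoolean("spark.ui.enabled", true)

  def main(args: Array[String]): Unit = {
    // loadDefaults = false, so no system properties are picked up.
    val disabled = new SparkConf(false).set("spark.ui.enabled", "false")
    val unset = new SparkConf(false)
    println(uiEnabled(disabled)) // false
    println(uiEnabled(unset))    // true
  }
}
```

In the patched class, the equivalent condition would presumably read `hiveContext.sparkContext.getConf.getBoolean("spark.ui.enabled", true)`.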

@liancheng
Contributor

ok to test

@@ -28,6 +28,7 @@ import org.apache.spark.annotation.DeveloperApi
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.ReflectionUtils._
import org.apache.spark.scheduler.{SparkListenerApplicationEnd, SparkListener}
import org.apache.spark.sql.hive.thriftserver.ui.ThriftServerTab
Contributor

Seems that imports are not ordered properly in this file...

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25219 has finished for PR 3946 at commit f796a81.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait ThriftServerEventListener
    • class SessionInfo(val session: HiveSession, val startTimestamp: Long)
    • class ExecutionInfo(val statement: String, val session: HiveSession, val startTimestamp: Long)
    • class ThriftServerUIEventListener(val conf: SparkConf)

@@ -184,8 +184,10 @@ private[hive] class SparkExecuteStatementOperation(
  }

  def run(): Unit = {
    val sid = UUID.randomUUID().toString
Contributor

I guess we can use parentSession.getSessionHandle.getSessionId instead?

Contributor Author

No, the listener needs a statement id, not a session id, for updating each statement's state.
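To illustrate why a per-statement id is needed, here is a minimal sketch using only the Scala standard library. The `Listener`, `Execution`, and state names are hypothetical stand-ins, not the classes in this patch. Two statements with identical SQL text in the same session stay distinguishable only because each gets its own UUID.

```scala
import java.util.UUID
import scala.collection.mutable

// Hypothetical names throughout; this is not the patch's listener.
object StatementTracking {
  sealed trait State
  case object Started extends State
  case object Finished extends State

  // One record per executed statement.
  final case class Execution(statement: String, sessionId: String, var state: State = Started)

  class Listener {
    // Keyed by statement id, not by session id or SQL text.
    private val executions = mutable.LinkedHashMap.empty[String, Execution]

    def onStatementStart(id: String, sessionId: String, statement: String): Unit =
      executions(id) = Execution(statement, sessionId)

    def onStatementFinish(id: String): Unit =
      executions.get(id).foreach(_.state = Finished)

    def stateOf(id: String): Option[State] = executions.get(id).map(_.state)
  }

  def main(args: Array[String]): Unit = {
    val listener = new Listener
    val first = UUID.randomUUID().toString
    val second = UUID.randomUUID().toString
    // Same session and same SQL text, but different statement ids.
    listener.onStatementStart(first, "session-1", "SELECT 1")
    listener.onStatementStart(second, "session-1", "SELECT 1")
    listener.onStatementFinish(first)
    println(listener.stateOf(first))  // Some(Finished)
    println(listener.stateOf(second)) // Some(Started)
  }
}
```

Keying the map by session id or by SQL text would make the second `onStatementStart` overwrite the first record.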

@liancheng
Contributor

@tianyi I understand that Hive 0.13.1 made incompatible changes to SessionManager.openSession, and that you'd like to avoid adding new stuff to the shim layer. However, once a shim is introduced, it's doomed to grow :-) I think we can have a SparkSQLSessionManagerBase in the shim layer, which only handles openSession, and then make SparkSQLSessionManager extend it.
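One possible shape for this split is sketched below, using only the Scala standard library. `SessionHandle` is a stand-in for the Hive class, the `openSession` signature is a simplified assumption (the real Hive signatures differ between versions), and the `onSessionCreated` hook is hypothetical.

```scala
import java.util.UUID
import scala.collection.mutable

object SessionManagerShimSketch {
  // Stand-in for Hive's SessionHandle.
  final case class SessionHandle(id: String)

  // Would live in the shim layer: owns only the version-specific openSession.
  abstract class SparkSQLSessionManagerBase {
    // Simplified, assumed signature for illustration.
    def openSession(username: String, ipAddress: String): SessionHandle = {
      val handle = SessionHandle(UUID.randomUUID().toString)
      onSessionCreated(handle, username, ipAddress)
      handle
    }

    // Hypothetical hook implemented by the version-independent subclass.
    protected def onSessionCreated(handle: SessionHandle, username: String, ipAddress: String): Unit
  }

  // Version-independent part: records session details for the UI.
  class SparkSQLSessionManager extends SparkSQLSessionManagerBase {
    val sessions = mutable.LinkedHashMap.empty[String, (String, String)]

    override protected def onSessionCreated(
        handle: SessionHandle, username: String, ipAddress: String): Unit = {
      sessions(handle.id) = (username, ipAddress)
    }
  }

  def main(args: Array[String]): Unit = {
    val manager = new SparkSQLSessionManager
    val handle = manager.openSession("alice", "127.0.0.1")
    println(manager.sessions(handle.id))
  }
}
```

With this layout, only the base class would need a separate copy per Hive version, while the UI bookkeeping is written once.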

@scwf Would you mind taking a look at this? Particularly the job-group-related part.


import scala.xml.Node

/** Page for Spark Web UI that shows statistics of a streaming job */
Contributor

streaming job?

@tianyi
Contributor Author

tianyi commented Jan 16, 2015

I mean that I can't match a job to an execution by its statement text if two identical SQL statements are running.

On Jan 16, 2015, at 11:21, Fei Wang wrote:

but it would go wrong when the Thrift server is executing two identical SQL statements at the same time.

Do you mean that using two Beeline clients to execute SQL (must it be the same SQL?) at the same time produces an error? What error do you get?


@tianyi
Contributor Author

tianyi commented Jan 19, 2015

rebased from latest master.

@SparkQA

SparkQA commented Jan 19, 2015

Test build #25735 has finished for PR 3946 at commit daed3d1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 19, 2015

Test build #25733 has finished for PR 3946 at commit 14a461d.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 19, 2015

Test build #25736 has finished for PR 3946 at commit fb507df.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  with ReflectedCompositeService {

  private lazy val sparkSqlOperationManager = new SparkSQLOperationManager(hiveContext)

  override def init(hiveConf: HiveConf) {
    setSuperField(this, "hiveConf", hiveConf)

Contributor

Remove this empty line.

use statementId as groupId to support multiple jobs for one sql.
support multiple jobs for one sql in ThriftServerPage.
@SparkQA

SparkQA commented Jan 20, 2015

Test build #25792 has finished for PR 3946 at commit 32332cb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@scwf
Contributor

scwf commented Jan 21, 2015

The group id part looks OK to me. We now create a UUID for each statement and save it in the EventListener. Later, we can make a follow-up PR to add the logic that cancels the job using the statement's group id when the user presses Ctrl-C. We may need to add an onStatementStop interface to EventListener for that.
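A rough sketch of that idea follows, using only the Scala standard library. The `StatementJobs` class and its method shapes are hypothetical. In a real server the `cancelJobGroup` callback would presumably be `SparkContext.cancelJobGroup`, with the group set through `SparkContext.setJobGroup` before the statement runs.

```scala
import scala.collection.mutable

// Hypothetical names; this only illustrates the bookkeeping.
object JobGroupSketch {
  class StatementJobs {
    // The statement id doubles as the job group id, so one SQL statement
    // can own several Spark jobs.
    private val jobsByGroup = mutable.LinkedHashMap.empty[String, mutable.ArrayBuffer[Int]]

    def onJobStart(groupId: String, jobId: Int): Unit = {
      jobsByGroup.getOrElseUpdate(groupId, mutable.ArrayBuffer.empty[Int]) += jobId
    }

    def jobsFor(groupId: String): List[Int] =
      jobsByGroup.get(groupId).map(_.toList).getOrElse(Nil)

    // Hypothetical onStatementStop hook: cancel every job in the statement's
    // group through the supplied callback, ignoring unknown groups.
    def onStatementStop(groupId: String, cancelJobGroup: String => Unit): Unit = {
      if (jobsByGroup.contains(groupId)) cancelJobGroup(groupId)
    }
  }

  def main(args: Array[String]): Unit = {
    val tracker = new StatementJobs
    tracker.onJobStart("stmt-1", 10)
    tracker.onJobStart("stmt-1", 11)
    tracker.onJobStart("stmt-2", 12)
    println(tracker.jobsFor("stmt-1")) // List(10, 11)
    tracker.onStatementStop("stmt-1", id => println(s"cancel group $id"))
  }
}
```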

@liancheng
Contributor

Actually, the Ctrl-C part resides in Beeline. Beeline does implement a signal handler for this, but the registration code is commented out; I don't know why...

@SparkQA

SparkQA commented Jan 26, 2015

Test build #26085 has finished for PR 3946 at commit fd35262.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

marmbrus commented Apr 3, 2015

ping @liancheng

@tianyi
Contributor Author

tianyi commented Apr 28, 2015

Because this PR has diverged significantly from the latest master branch, I have created a new PR, #5730, for this feature.
This PR should therefore be closed.

@tianyi tianyi closed this Apr 28, 2015
asfgit pushed a commit that referenced this pull request May 4, 2015
This PR is a rebased version of #3946 and mainly focuses on creating an independent tab for the Thrift server in the Spark web UI.

Features:

1. Session-related statistics (username and IP are only supported in hive-0.13.1)
2. List all SQL statements executing or executed on this server
3. Provide links to the jobs generated by each SQL statement
4. Provide a link to show all SQL statements executing or executed in a specified session

Prototype snapshots:

This is the main page for thrift server

![image](https://cloud.githubusercontent.com/assets/1411869/7361379/df7dcc64-ed89-11e4-9964-4df0b32f475e.png)

Author: tianyi <[email protected]>

Closes #5730 from tianyi/SPARK-5100 and squashes the following commits:

cfd14c7 [tianyi] style fix
0efe3d5 [tianyi] revert part of pom change
c0f2fa0 [tianyi] extends HiveThriftJdbcTest to start/stop thriftserver for UI test
aa20408 [tianyi] fix style problem
c9df6f9 [tianyi] add testsuite for thriftserver ui and fix some style issue
9830199 [tianyi] add webui for thriftserver
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
7 participants