[SPARK-23838][WEBUI] Running SQL query is displayed as "completed" in SQL tab #20955

Closed
wants to merge 2 commits into from

gengliangwang
Member

@gengliangwang gengliangwang commented Apr 1, 2018

What changes were proposed in this pull request?

A running SQL query would appear as completed in the Spark UI:
image1

We can see the query in "Completed queries", while on the job page we see that Job 132 is still running.
image2

After some time, the query still appears in "Completed queries" (while it is still running), but its "Duration" keeps increasing.
image3

To reproduce, run a query that spawns multiple jobs, e.g. TPCDS q6.

The cause is that updates from executions are written into the kvstore only periodically, so the job start event may be missed.

How was this patch tested?

Manually ran the job again and checked the SQL tab. The fix is pretty simple.

@SparkQA

SparkQA commented Apr 1, 2018

Test build #88790 has finished for PR 20955 at commit b418feb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member Author

gengliangwang commented Apr 1, 2018

@vanzin Please help review this.

@SparkQA

SparkQA commented Apr 1, 2018

Test build #88795 has finished for PR 20955 at commit 2357f59.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -39,7 +39,7 @@ private[ui] class AllExecutionsPage(parent: SQLTab) extends WebUIPage("") with L
val failed = new mutable.ArrayBuffer[SQLExecutionUIData]()

sqlStore.executionsList().foreach { e =>
- val isRunning = e.jobs.exists { case (_, status) => status == JobExecutionStatus.RUNNING }
+ val isRunning = e.completionTime.isEmpty
Contributor

I think you actually need both checks. The code in 2.2 is this:

      if (executionUIData.completionTime.nonEmpty && !executionUIData.hasRunningJobs) {
        // We are the last job of this execution, so mark the execution as finished. Note that
        // `onExecutionEnd` also does this, but currently that can be called before `onJobEnd`
        // since these are called on different threads.
        markExecutionFinished(executionId)
      }

The original reason why this code is like this is that job events and sql execution events could arrive out of order; I don't know if that is still true, but I tried to maintain the same workarounds in the new code.

If the out-of-order issue exists, then your change would introduce the opposite problem: an execution marked as completed when existing known jobs are still running, because the execution end event arrived before the job end event.
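
To make the "both checks" suggestion concrete, here is a minimal sketch of how the two conditions could be combined. `JobStatus`, `ExecutionSketch`, and `ExecutionState` are hypothetical, simplified stand-ins invented for illustration; they are not Spark's `JobExecutionStatus` or `SQLExecutionUIData`, and this is not necessarily the code that was eventually merged.

```scala
// Hypothetical stand-in for Spark's job status enum (illustration only).
object JobStatus extends Enumeration {
  val Running, Succeeded, Failed = Value
}

// Hypothetical stand-in for the per-execution UI data (illustration only).
final case class ExecutionSketch(
    completionTime: Option[Long],     // defined once the execution-end event has been seen
    jobs: Map[Int, JobStatus.Value])  // job id -> last known status

object ExecutionState {
  // Treat an execution as running if it has no completion time yet,
  // or if any job known to it is still reported as running.
  def isRunning(e: ExecutionSketch): Boolean =
    e.completionTime.isEmpty || e.jobs.values.exists(_ == JobStatus.Running)
}
```

With this shape, an execution whose job start event was missed (empty `jobs`, no `completionTime`) still counts as running, and an execution whose end event arrived before its last job end event also stays in the running list until that job finishes.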

Member Author

Makes sense. I have updated the code.

@SparkQA

SparkQA commented Apr 3, 2018

Test build #88841 has finished for PR 20955 at commit ceb28eb.

  • This patch fails from timeout after a configured wait of `300m`.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Apr 3, 2018

retest this please

@SparkQA

SparkQA commented Apr 4, 2018

Test build #88857 has finished for PR 20955 at commit ceb28eb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@jiangxb1987 jiangxb1987 left a comment

LGTM

@vanzin
Contributor

vanzin commented Apr 4, 2018

Merging to master / 2.3.

asfgit pushed a commit that referenced this pull request Apr 4, 2018
… SQL tab

## What changes were proposed in this pull request?

A running SQL query would appear as completed in the Spark UI:
![image1](https://user-images.githubusercontent.com/1097932/38170733-3d7cb00c-35bf-11e8-994c-43f2d4fa285d.png)

We can see the query in "Completed queries", while on the job page we see that Job 132 is still running.
![image2](https://user-images.githubusercontent.com/1097932/38170735-48f2c714-35bf-11e8-8a41-6fae23543c46.png)

After some time, the query still appears in "Completed queries" (while it is still running), but its "Duration" keeps increasing.
![image3](https://user-images.githubusercontent.com/1097932/38170737-50f87ea4-35bf-11e8-8b60-000f6f918964.png)

To reproduce, we can run a query with multiple jobs. E.g. Run TPCDS q6.

The reason is that updates from executions are written into kvstore periodically, and the job start event may be missed.

## How was this patch tested?
Manually ran the job again and checked the SQL tab. The fix is pretty simple.

Author: Gengliang Wang <[email protected]>

Closes #20955 from gengliangwang/jobCompleted.

(cherry picked from commit d8379e5)
Signed-off-by: Marcelo Vanzin <[email protected]>
@asfgit asfgit closed this in d8379e5 Apr 4, 2018
mshtelma pushed a commit to mshtelma/spark that referenced this pull request Apr 5, 2018
robert3005 pushed a commit to palantir/spark that referenced this pull request Apr 7, 2018
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
Change-Id: I779e8d5363150afe3045c774f77f6395972192f8