-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-6862][Streaming][WebUI] Add BatchPage to display details of a batch #5473
Conversation
My test codes: import org.apache.spark._
import org.apache.spark.streaming._
object StreamingApp {
def main(args: Array[String]): Unit = {
val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount").set("spark.streaming.concurrentJobs", "3")
val ssc = new StreamingContext(conf, Seconds(10))
val lines = ssc.socketTextStream("localhost", 9999)
val words = lines.flatMap(_.split(" "))
val pairs = words.map(word => {
Thread.sleep(1000)
(word, 1)
})
val wordCounts = pairs.reduceByKey((x: Int, y: Int) => x + y, 2)
wordCounts.foreachRDD {
rdd => rdd.foreach { v =>
println(v)
}
}
wordCounts.foreachRDD { rdd =>
rdd.foreach { v =>
println(v)
}
rdd.foreach { v =>
println(v)
//throw new RuntimeException("Oops!")
}
}
ssc.start()
ssc.awaitTermination()
}
} |
cc @tdas |
Test build #30084 has finished for PR 5473 at commit
|
/** | ||
* The number of recorders received by the receivers in this batch. | ||
*/ | ||
def numRecords: Long = receivedBlockInfo.map { case (_, infos) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on exposing this.
Its looking pretty good! Here are some preliminary comments. I will do a more detailed pass tomorrow during the day.
|
<th>Duration</th> | ||
<th class="sorttable_nosort">Stages: Succeeded/Total</th> | ||
<th class="sorttable_nosort">Tasks (for all stages): Succeeded/Total</th> | ||
<th>Last Error</th> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this Last Error
and not just Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking Last Error
of tasks. But here is Stage. I agree it should be Error
. Fixed.
@zsxwing Any thoughts on this? |
As I explained in
Let me try some css style to fix it. |
Why would we get wrong properties in this case? We are explicitly setting it in the thread that is launching the Spark job, so there is no question of inheriting local properties by child threads, etc. So that issue should not affect this case, isnt it? Or am I missing something? |
Without #5288, SparkListenerJobStart is created like And after |
Please ignore that |
After thinking it carefully, this may not be an issue, since StreamingContext should be stopped in such case. @tdas, do you think if we still need to show Processing time and Total delay in this case? |
Test build #30220 has finished for PR 5473 at commit
|
Test build #30223 has finished for PR 5473 at commit
|
Conflicts: streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala
Test build #30330 has finished for PR 5473 at commit
|
@tdas this is ready for code review. |
Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
Test build #30585 has finished for PR 5473 at commit
|
In addition, because of |
High level points as discussed offline. We are planning to increase the number of retainedBatches, hence we need to be careful about all the data that we retain in memory. And its not a good idea to retain BatchInfo objects in memory because it contains a lot of arbitrary |
Test build #31237 has finished for PR 5473 at commit
|
private val runningBatchInfos = new HashMap[Time, BatchInfo] | ||
private val completedBatchInfos = new Queue[BatchInfo] | ||
private val batchInfoLimit = ssc.conf.getInt("spark.streaming.ui.retainedBatches", 100) | ||
private val waitingBatchUIDatas = new HashMap[Time, BatchUIData] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Datas
sounds really weird. Just leave it as BatchUIData
. Or if you want to follow other hashmaps naming styles timeToWaitingBatchUIData
. Either is fine by me.
Test build #31284 has finished for PR 5473 at commit
|
Test build #31287 has finished for PR 5473 at commit
|
I have merged #5288. Let me take another look at this PR. |
retest this please. |
rest this please. |
Test build #31323 has finished for PR 5473 at commit
|
retest this please. |
Test build #733 has started for PR 5473 at commit |
retest this please. |
Test build #31346 has finished for PR 5473 at commit
|
The test has passed. Merging this! Thanks @zsxwing ! |
…a batch This is an initial commit for SPARK-6862. Once SPARK-6796 is merged, I will add the links to StreamingPage so that the user can jump to BatchPage. Screenshots:   Author: zsxwing <[email protected]> Closes apache#5473 from zsxwing/SPARK-6862 and squashes the following commits: 0727d35 [zsxwing] Change BatchUIData to a case class b380cfb [zsxwing] Add createJobStart to eliminate duplicate codes 9a3083d [zsxwing] Rename XxxDatas -> XxxData 087ba98 [zsxwing] Refactor BatchInfo to store only necessary fields cb62e4f [zsxwing] Use Seq[(OutputOpId, SparkJobId)] to store the id relations 72f8e7e [zsxwing] Add unit tests for BatchPage 1282b10 [zsxwing] Handle some corner cases and add tests for StreamingJobProgressListener 77a69ae [zsxwing] Refactor codes as per TD's comments 35ffd80 [zsxwing] Merge branch 'master' into SPARK-6862 15bdf9b [zsxwing] Add batch links and unit tests 4bf66b6 [zsxwing] Merge branch 'master' into SPARK-6862 7168807 [zsxwing] Limit the max width of the error message and fix nits in the UI 0b226f9 [zsxwing] Change 'Last Error' to 'Error' fc98a43 [zsxwing] Put clearing local properties to finally and remove redundant private[streaming] 0c7b2eb [zsxwing] Add BatchPage to display details of a batch
…a batch This is an initial commit for SPARK-6862. Once SPARK-6796 is merged, I will add the links to StreamingPage so that the user can jump to BatchPage. Screenshots:   Author: zsxwing <[email protected]> Closes apache#5473 from zsxwing/SPARK-6862 and squashes the following commits: 0727d35 [zsxwing] Change BatchUIData to a case class b380cfb [zsxwing] Add createJobStart to eliminate duplicate codes 9a3083d [zsxwing] Rename XxxDatas -> XxxData 087ba98 [zsxwing] Refactor BatchInfo to store only necessary fields cb62e4f [zsxwing] Use Seq[(OutputOpId, SparkJobId)] to store the id relations 72f8e7e [zsxwing] Add unit tests for BatchPage 1282b10 [zsxwing] Handle some corner cases and add tests for StreamingJobProgressListener 77a69ae [zsxwing] Refactor codes as per TD's comments 35ffd80 [zsxwing] Merge branch 'master' into SPARK-6862 15bdf9b [zsxwing] Add batch links and unit tests 4bf66b6 [zsxwing] Merge branch 'master' into SPARK-6862 7168807 [zsxwing] Limit the max width of the error message and fix nits in the UI 0b226f9 [zsxwing] Change 'Last Error' to 'Error' fc98a43 [zsxwing] Put clearing local properties to finally and remove redundant private[streaming] 0c7b2eb [zsxwing] Add BatchPage to display details of a batch
…a batch This is an initial commit for SPARK-6862. Once SPARK-6796 is merged, I will add the links to StreamingPage so that the user can jump to BatchPage. Screenshots:   Author: zsxwing <[email protected]> Closes apache#5473 from zsxwing/SPARK-6862 and squashes the following commits: 0727d35 [zsxwing] Change BatchUIData to a case class b380cfb [zsxwing] Add createJobStart to eliminate duplicate codes 9a3083d [zsxwing] Rename XxxDatas -> XxxData 087ba98 [zsxwing] Refactor BatchInfo to store only necessary fields cb62e4f [zsxwing] Use Seq[(OutputOpId, SparkJobId)] to store the id relations 72f8e7e [zsxwing] Add unit tests for BatchPage 1282b10 [zsxwing] Handle some corner cases and add tests for StreamingJobProgressListener 77a69ae [zsxwing] Refactor codes as per TD's comments 35ffd80 [zsxwing] Merge branch 'master' into SPARK-6862 15bdf9b [zsxwing] Add batch links and unit tests 4bf66b6 [zsxwing] Merge branch 'master' into SPARK-6862 7168807 [zsxwing] Limit the max width of the error message and fix nits in the UI 0b226f9 [zsxwing] Change 'Last Error' to 'Error' fc98a43 [zsxwing] Put clearing local properties to finally and remove redundant private[streaming] 0c7b2eb [zsxwing] Add BatchPage to display details of a batch
This is an initial commit for SPARK-6862. Once SPARK-6796 is merged, I will add the links to StreamingPage so that the user can jump to BatchPage.
Screenshots:

