Commit
add note on consistency of results for sys.segments queries (#7034) (#7101)

* add doc

* change docs

* PR comments

* few more changes
Surekha authored and fjy committed Feb 20, 2019
1 parent aae0f10 commit 9990e04
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/content/querying/sql.md
@@ -571,6 +571,8 @@ The "sys" schema provides visibility into Druid segments, servers and tasks.
### SEGMENTS table
Segments table provides details on all Druid segments, whether they are published yet or not.

#### CAVEAT
Note that a segment can be served by more than one stream ingestion task or Historical process, in which case it has multiple replicas. While a segment is served by multiple ingestion tasks, these replicas are only weakly consistent with each other; once the segment is served by a Historical, it is immutable. The Broker prefers to query a segment from a Historical over an ingestion task. However, if a segment has multiple realtime replicas (for example, Kafka index tasks) and one task is slower than the other, the results of a sys.segments query can vary for the duration of the tasks, because the Broker queries only one of the ingestion tasks and it is not guaranteed that the same task is picked every time. The `num_rows` column of the segments table can have inconsistent values during this period. There is an open [issue](https://github.com/apache/incubator-druid/issues/5915) about this inconsistency with stream ingestion tasks.
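
The behavior described above can be illustrated with a small toy model. This is only a sketch and not Druid code: the `Replica` class, `pick_replica`, and `observed_num_rows` are hypothetical names, and uniform random choice among realtime tasks is an assumption made for illustration, since the text only says the same task is not guaranteed to be picked.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical stand-in for one replica of a segment."""
    server_type: str  # "historical" or "realtime"
    num_rows: int     # row count this replica would report


def pick_replica(replicas, rng):
    """Toy rule: prefer a Historical replica, otherwise pick any realtime task.

    The random choice is an assumption; it only models "not guaranteed
    to be the same task every time".
    """
    historicals = [r for r in replicas if r.server_type == "historical"]
    if historicals:
        return historicals[0]
    return rng.choice(replicas)


def observed_num_rows(replicas, queries, seed=0):
    """Return the set of num_rows values seen across repeated queries."""
    rng = random.Random(seed)
    return {pick_replica(replicas, rng).num_rows for _ in range(queries)}


# Two realtime replicas, one lagging behind the other.
realtime = [Replica("realtime", 1000), Replica("realtime", 1200)]
print(observed_num_rows(realtime, queries=200))

# Once a Historical serves the segment, the reported value is stable.
handed_off = realtime + [Replica("historical", 1200)]
print(observed_num_rows(handed_off, queries=200))
```

In this model, repeated queries against the two realtime replicas can report either row count, while the value stops varying as soon as a Historical replica is present.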

|Column|Notes|
|------|-----|