forked from apache/kudu
Commit
KUDU-1802: Avoid calls to master when using scan tokens
This patch adds new metadata to the scan token so that it contains everything required to construct a KuduTable and open a scanner in the clients. This means the GetTableSchema and GetTableLocations RPC calls to the master are no longer required when using the scan token.

New TableMetadataPB, TabletMetadataPB, and authorization token fields were added as optional fields on the token. Additionally, a `projected_column_idx` field was added that can be used in place of `projected_columns`. This significantly reduces the size of the scan token by not duplicating the ColumnSchemaPB that is already in the TableMetadataPB.

Adding the table metadata to the scan token is enabled by default, given that it is more scalable and performant. However, it can be disabled in rare cases where more resiliency to column renaming is desired. One example where the table metadata is disabled is the backup job. Future work, tracked by KUDU-3146, should allow table metadata to be leveraged in those cases as well.

This does not avoid the call to the master to get the schema when writing data to Kudu; that work is tracked by KUDU-3135. I expect the TableMetadataPB message would be used there as well.

I included the ability to disable this functionality in the kudu-spark integration via `kudu.useDriverMetadata`, in case there are any unforeseen issues or regressions with this feature.

I added a test to compare the serialized size of the scan token with and without the table and tablet metadata. The size results, in bytes, for a 100-column table are:

  no metadata: 2697
  tablet metadata: 2805
  tablet, table, and authz metadata: 3258

Change-Id: I88c1b8392de37dd5e8b7bd8b78a21603ff8b1d1b
Reviewed-on: http://gerrit.cloudera.org:8080/16031
Reviewed-by: Grant Henke <[email protected]>
Tested-by: Grant Henke <[email protected]>
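To illustrate the `projected_column_idx` idea described in the message, here is a minimal sketch of projecting by index into a schema that the token already carries, rather than repeating each column's schema. It is a toy model only: `ProjectionSketch`, `Column`, `toIndexes`, and `resolve` are hypothetical names made up for this example and are not Kudu's protobuf classes or client API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model only; these types are hypothetical, not Kudu's protobuf classes. */
class ProjectionSketch {

    /** Stand-in for one column schema entry carried in the table metadata. */
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Builds an index-based projection: the position of each projected column in the table schema. */
    static List<Integer> toIndexes(List<Column> tableSchema, List<String> projectedNames) {
        List<Integer> indexes = new ArrayList<>();
        for (String name : projectedNames) {
            int idx = -1;
            for (int i = 0; i < tableSchema.size(); i++) {
                if (tableSchema.get(i).name.equals(name)) {
                    idx = i;
                    break;
                }
            }
            if (idx < 0) {
                throw new IllegalArgumentException("Unknown column: " + name);
            }
            indexes.add(idx);
        }
        return indexes;
    }

    /** Resolves an index-based projection against the schema embedded alongside it. */
    static List<Column> resolve(List<Column> tableSchema, List<Integer> indexes) {
        List<Column> projection = new ArrayList<>();
        for (int idx : indexes) {
            projection.add(tableSchema.get(idx));
        }
        return projection;
    }

    public static void main(String[] args) {
        List<Column> schema = Arrays.asList(
            new Column("key", "INT64"),
            new Column("name", "STRING"),
            new Column("score", "DOUBLE"));

        // The token would carry only these small integers for the projection.
        List<Integer> indexes = toIndexes(schema, Arrays.asList("score", "key"));
        System.out.println(indexes); // prints [2, 0]

        // The reader rebuilds the projected columns from the embedded schema.
        for (Column c : resolve(schema, indexes)) {
            System.out.println(c.name + " " + c.type); // prints "score DOUBLE" then "key INT64"
        }
    }
}
```

In this model the projection costs one integer per column instead of a second copy of each column's schema, which is the kind of duplication the message says the new field avoids.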
1 parent 23f67ae, commit d23ee5d
Showing 19 changed files with 1,023 additions and 135 deletions.
java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java: 266 changes (231 additions, 35 deletions)