MetaCache can issue excessive RPCs when looking for co-located table tablet by key #7413
Labels
kind/bug
This issue is a bug
Comments
ttyusupov added a commit that referenced this issue on Mar 5, 2021
…itions list for co-located tables

Summary:
The fix for #6890 introduced a potential perf issue in the first lookup of a tablet by key for colocated tables. Instead of sending 1 RPC for the first lookup on a colocated table and then reusing the result for all tables co-located with it, MetaCache sends 1 more RPC each time another table co-located with the first one is queried to resolve a tablet by key.

Since all colocated tables share the same tablet, we can cache the locations on the first RPC for any co-located table and then reuse the result for `MetaCache::LookupTabletByKey` calls for any other table co-located with the one already queried.

Suppose we have colocated tables `Table1` and `Table2` sharing `Tablet0`. The behavior without the fix is the following:
1. Someone calls `MetaCache::LookupTabletByKey` for `Table1` and `partition_key=p`
2. `MetaCache` checks that it doesn’t have `TableData` for `Table1`, initializes `TableData` for `Table1` with the list of partitions for `Table1`, and sends an RPC to the master
3. Master returns tablet locations that contain tablet locations for both `Table1` and `Table2`, because they are colocated and share the same set of tablets
4. `MetaCache` updates `TableData::tablets_by_partition` for `Table1`
5. Caller gets `Tablet0` as a response to `MetaCache::LookupTabletByKey`
6. Someone calls `MetaCache::LookupTabletByKey` for `Table2` and `partition_key=p`
7. `MetaCache` checks that it doesn’t have `TableData` for `Table2` and sends an RPC to the master

With the fix, at step 4 `MetaCache` also initializes `TableData` for `Table2` using the same partitions list that was used for `Table1` and updates `TableData::tablets_by_partition` for both tables. So, at step 7, `MetaCache` already has `TableData` for `Table2` and responds with the tablet without an RPC to the master.

- Fixed `MetaCache::ProcessTabletLocations` to reuse partitions list for co-located tables
- Added ClientTest.ColocatedTablesLookupTablet
- Moved most frequent VLOGS from level 4 to level 5 for `MetaCache`

Test Plan:
For ASAN/TSAN/release/debug:
```
ybd --gtest_filter ClientTest.ColocatedTablesLookupTablet -n 100 -- -p 1
```

Reviewers: mbautin, bogdan
Reviewed By: mbautin, bogdan
Subscribers: ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D10755
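The difference between the two behaviors in the commit summary can be illustrated with a toy model. This is only a sketch under loud assumptions: `TabletLocations`, `FakeMaster`, `SketchMetaCache`, the `reuse_for_colocated` flag, and `CountRpcs` are hypothetical names invented for illustration, and the real `MetaCache` tracks partitions per table rather than a single tablet id per table.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified stand-ins; NOT the real YugabyteDB types.
struct TabletLocations {
  std::string tablet_id;
  // All tables hosted on this tablet. For colocated tables the master's
  // response covers every table sharing the tablet.
  std::vector<std::string> table_ids;
};

// Fake master that counts how many location RPCs it has served.
class FakeMaster {
 public:
  explicit FakeMaster(TabletLocations locations)
      : locations_(std::move(locations)) {}

  TabletLocations GetTableLocations(const std::string& /*table_id*/) {
    ++rpc_count_;
    return locations_;
  }

  int rpc_count() const { return rpc_count_; }

 private:
  TabletLocations locations_;
  int rpc_count_ = 0;
};

class SketchMetaCache {
 public:
  SketchMetaCache(FakeMaster* master, bool reuse_for_colocated)
      : master_(master), reuse_for_colocated_(reuse_for_colocated) {}

  // The partition key is ignored here because the model assumes a single
  // shared tablet per group of colocated tables.
  std::string LookupTabletByKey(const std::string& table_id,
                                const std::string& /*partition_key*/) {
    auto it = tablet_by_table_.find(table_id);
    if (it != tablet_by_table_.end()) {
      return it->second;  // Cache hit: no RPC to the master.
    }
    TabletLocations locations = master_->GetTableLocations(table_id);
    ProcessTabletLocations(table_id, locations);
    return locations.tablet_id;
  }

 private:
  void ProcessTabletLocations(const std::string& requested_table,
                              const TabletLocations& locations) {
    tablet_by_table_[requested_table] = locations.tablet_id;
    if (!reuse_for_colocated_) {
      return;  // Old behavior: only the requested table is cached.
    }
    // New behavior: cache the result for every table sharing the tablet.
    for (const auto& table_id : locations.table_ids) {
      tablet_by_table_[table_id] = locations.tablet_id;
    }
  }

  FakeMaster* master_;
  bool reuse_for_colocated_;
  std::map<std::string, std::string> tablet_by_table_;
};

// Replays steps 1-7 from the summary and returns the number of master RPCs.
int CountRpcs(bool reuse_for_colocated) {
  FakeMaster master(TabletLocations{"Tablet0", {"Table1", "Table2"}});
  SketchMetaCache cache(&master, reuse_for_colocated);
  const std::string t1 = cache.LookupTabletByKey("Table1", "p");
  const std::string t2 = cache.LookupTabletByKey("Table2", "p");
  assert(t1 == "Tablet0" && t2 == "Tablet0");
  return master.rpc_count();
}
```

In this model, `CountRpcs(false)` replays the behavior before the fix and `CountRpcs(true)` the behavior after it; the only difference is whether the lookup for `Table2` finds an entry already populated by the response to the `Table1` lookup.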
polarweasel pushed a commit to lizayugabyte/yugabyte-db that referenced this issue on Mar 9, 2021
The fix for #6890 introduced a potential performance issue in the first tablet-by-key lookup for colocated tables. Instead of sending one RPC for the first lookup on a colocated table and then reusing the result for every table co-located with it, MetaCache sends one RPC per table (only on each table's first lookup; after that it reuses the cached result).