Add support for get_many
#16
for key in keys:
    dest_region = self._find_hosting_region(table, key)
    # we must call each region server, which can serve many key ranges
    grouped_by_server[dest_region.region_client.host][dest_region].append(key)
The format for using this call is pretty straightforward, but here's some context if you're not too familiar with HBase:
Each region server hosts one or more (in our case, many) key ranges for a given table. We need to assign each key to the key range that contains it. Once we've done that, we can re-group the keys by server and send each server a single request for all keys that fall in any key range it hosts.
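To make the grouping concrete, here is a minimal sketch of the two-level bucketing (host first, then key range). The `Region` tuple, the `find_hosting_region` helper, and the sample key ranges are hypothetical stand-ins for illustration, not the client's actual types:

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a region: a half-open key range [start, stop) on a host.
Region = namedtuple("Region", ["host", "start", "stop"])

REGIONS = [
    Region("rs1", "", "g"),
    Region("rs2", "g", "n"),
    Region("rs1", "n", "t"),
    Region("rs2", "t", None),  # None means "no upper bound"
]


def find_hosting_region(key):
    """Return the region whose key range contains `key` (linear scan for clarity)."""
    for region in REGIONS:
        if key >= region.start and (region.stop is None or key < region.stop):
            return region
    raise KeyError(key)


def group_keys(keys):
    """Bucket keys by host, then by region on that host."""
    grouped = defaultdict(lambda: defaultdict(list))
    for key in keys:
        region = find_hosting_region(key)
        grouped[region.host][region].append(key)
    return grouped


grouped = group_keys(["apple", "zebra", "pear", "kiwi", "banana"])
for host in sorted(grouped):
    # One request per host would carry every (region, keys) pair listed here.
    print(host, {(r.start, r.stop): ks for r, ks in grouped[host].items()})
```

With this shape, the number of round trips is bounded by the number of hosts rather than the number of keys or regions.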
brilliant
@@ -128,6 +129,8 @@ def _send_request(self, rq, lock_timeout=10):

# send and receive the request
future = self.thread_pool.submit(self.send_and_receive_rpc, my_id, rq, to_send)
if _async:
    return future
We submit to a thread pool here but then immediately block on the result. That would work in a gevent world, but it defeats the entire point of the thread pool in normal Python execution.
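A rough sketch of the difference, using only the standard library. `slow_rpc` and `send_request` are hypothetical stand-ins rather than the client's real methods; the `_async` flag mirrors the one in the hunk above:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_rpc(n):
    """Hypothetical stand-in for one network round trip."""
    time.sleep(0.2)
    return n * 2


def send_request(pool, n, _async=False):
    future = pool.submit(slow_rpc, n)
    if _async:
        return future       # caller decides when to wait
    return future.result()  # blocks right away, so calls run one at a time


with ThreadPoolExecutor(max_workers=4) as pool:
    start = time.monotonic()
    blocking = [send_request(pool, n) for n in range(4)]
    blocking_elapsed = time.monotonic() - start

    start = time.monotonic()
    futures = [send_request(pool, n, _async=True) for n in range(4)]
    overlapped = [f.result() for f in futures]
    overlapped_elapsed = time.monotonic() - start

print(blocking == overlapped, overlapped_elapsed < blocking_elapsed)
```

Both paths return the same results; only the second lets the four calls overlap on the pool's worker threads.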
@@ -240,7 +246,7 @@ def NewClient(host, port, pool_size, secondary=False, call_timeout=60):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((c.host, int(port)))
_send_hello(s)
- s.settimeout(2)
+ s.settimeout(call_timeout)
We would submit these onto a thread pool and then immediately block for call_timeout (60 seconds by default). However, the actual timeout on the socket was hardcoded to 2 seconds, so you would never actually wait for whatever your call_timeout was.
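A small sketch of why the shorter socket timeout wins. The timeouts are scaled down so it runs quickly, and `receive` plus the silent `socketpair` peer are hypothetical stand-ins for the real RPC path:

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor

CALL_TIMEOUT = 2.0    # what the caller asked for (stands in for 60 seconds)
SOCKET_TIMEOUT = 0.2  # what the socket was configured with (stands in for 2 seconds)


def receive(sock):
    """Wait for a reply; return None if the socket-level timeout fires first."""
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


client, server = socket.socketpair()  # `server` never sends anything
client.settimeout(SOCKET_TIMEOUT)

with ThreadPoolExecutor(max_workers=1) as pool:
    start = time.monotonic()
    reply = pool.submit(receive, client).result(timeout=CALL_TIMEOUT)
    elapsed = time.monotonic() - start

client.close()
server.close()
print(reply, elapsed < CALL_TIMEOUT)
```

The wait ends when the socket gives up, well before the caller's timeout, which is why passing call_timeout through to `settimeout` matters.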
Add support for bulk gets to HBase. This makes use of a protobuf MultiRequest that I discovered while digging around. Also fixed a host of bugs; I'll add some comments inline.