Handling DynamoDB "unprocessed_items" from batch_write_item is too difficult #815
I did the following and it worked as expected:

```ruby
items = (1..25).map do |n|
  {
    put_request: {
      item: {
        "id" => "value#{n}"
      }
    }
  }
end

unprocessed_items = nil
100.times do
  # the target table was created with 1 write capacity unit to ensure
  # the batch write gets throttled
  r = dynamodb.batch_write_item(request_items: { "aws-sdk-slow": items })
  if r.unprocessed_items.count > 0
    unprocessed_items = r.unprocessed_items
    break
  end
end

dynamodb.batch_write_item(request_items: unprocessed_items)
```

Can you share an example of what you are doing that requires the customization above?
Hmmm, that's odd. OK, here's the error, which includes output from the code below (excuse the mess: you can see the working version in the stuff that's commented out):

```ruby
# Note AWS API limit of 25 puts per request, so we batch the puts
# and add a delay between batches.
# TODO: limit of 1MB per request in total not enforced here
def self.put_items(table_name, items)
  __start_time = Time.now if DEBUG
  in_batches_of_25(items) do |items_batch, last_iteration|
    put_requests = items_batch.map do |item|
      {
        put_request: {
          item: item
        }
      }
    end
    request_items = {
      table_name => put_requests
    }
    loop do
      response = client.batch_write_item(request_items: request_items)
      sleep NICENESS_PAUSE unless last_iteration
      break if response.unprocessed_items.size == 0
      request_items = response.unprocessed_items
      # request_items = unprocessed_items_to_request_items(response.unprocessed_items)
      Rails.logger.info("request_items")
      Rails.logger.info(request_items)
      sleep NICENESS_PAUSE * 2 # sleep again: we're not managing to store all items
    end
  end
ensure
  ::Rails.logger.debug("put_items #{table_name} (#{items.length}): #{(Time.now - __start_time) * 1000}ms") if DEBUG
end

# def self.unprocessed_items_to_request_items(unprocessed_items)
#   unprocessed_items.as_json.deep_transform_keys do |key|
#     if %w{ item put_request n s b ss ns bs m l null bool delete_request }.include? key
#       key.to_sym
#     else
#       key
#     end
#   end
# end
```
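(For reference, `in_batches_of_25` isn't shown anywhere in the thread. A hypothetical reconstruction, judging purely from how it's called above, is that it yields each slice of up to 25 items together with a flag marking the final slice:

```ruby
# Hypothetical sketch of the undefined helper referenced above;
# the real implementation may differ.
def self.in_batches_of_25(items)
  slices = items.each_slice(25).to_a
  slices.each_with_index do |slice, index|
    yield slice, index == slices.size - 1
  end
end
```

)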
Can you pretty-print or inspect the `request_items` value?
It's in the log output above: you have to scroll to the right a bit. Let me know if that's enough?
Yup, that's what I was looking for; I didn't notice the wide scroll. Have you potentially disabled parameter conversion when you construct the client? Also, as a workaround, this should perform the nested conversion you require without having to whitelist specific keys for the string maps:

```ruby
request_items = response.data.to_h[:unprocessed_items]
```

I want to stress that this should not be required, and if there is a bug when param conversion is disabled, then I'll want to address it.
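Slotted into the retry loop from the code above, that workaround would look roughly like this (a sketch reusing the `client`, `table_name`, and `put_requests` names from earlier in the thread):

```ruby
request_items = { table_name => put_requests }
loop do
  response = client.batch_write_item(request_items: request_items)
  break if response.unprocessed_items.empty?
  # #to_h turns the response struct into plain hashes, which the request
  # serializer accepts even with param conversion disabled.
  request_items = response.data.to_h[:unprocessed_items]
  sleep 1 # simple pacing between retries (illustrative)
end
```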
Ah, my config disables param conversion (`convert_params: false`), mostly because that was the easiest way to migrate from the v1 SDK, if I remember right. I'll try that workaround out: it looks way better than my hack, even if it's an interim thing. :-)
I've now been able to reproduce the issue you are experiencing by disabling param conversion. If you re-enable the `:convert_params` option, the round-trip works. Param conversion allows you to pass in a wider set of input values. For example, it allows you to pass in strings for dates and it will attempt to parse them into Time objects when necessary. When you disable param conversion, you are required to supply the proper type, e.g. a Time object rather than a date string. That said, the param validator should be updated to work with structs natively, without conversion. I'll take a look at this. For now, my recommendation would be to set `convert_params` to `true` and you should be good.
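A minimal sketch of that recommendation (the region value is illustrative, and `request_items` is assumed to be built as in the earlier snippets):

```ruby
require 'aws-sdk' # v2

# With :convert_params left at its default of true, response structs can
# be passed back in as request parameters and coerced as needed.
client = Aws::DynamoDB::Client.new(
  region: 'us-east-1',  # illustrative
  convert_params: true  # the default; false reproduces the error above
)

resp = client.batch_write_item(request_items: request_items)
unless resp.unprocessed_items.empty?
  client.batch_write_item(request_items: resp.unprocessed_items)
end
```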
Resolved an issue where structure parameter values, such as those returned in responses, could not be serialized as input parameters when parameter conversion is disabled. This fix resolves the issue by changing the serializers to enumerate all structure/hash values in a consistent manner, using `#each_pair`. Also added a few minor helpers to `Aws::Structure`, making it quack more like a Hash object. Fixes #815
The commit above contains the fix; it makes it possible to round-trip the response data structures as input without parameter conversion. Until it is released, enabling parameter conversion or calling `#to_h` on the response data will work around the issue. Thank you for reporting the issue.
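Once the fix ships, the retry logic from this thread should reduce to something like the following sketch (backoff values are illustrative, and `client`, `table_name`, and `put_requests` are assumed from earlier):

```ruby
request_items = { table_name => put_requests }
attempt = 0
loop do
  response = client.batch_write_item(request_items: request_items)
  break if response.unprocessed_items.empty?
  # Pass the response struct straight back in; the fix above lets it
  # round-trip without param conversion.
  request_items = response.unprocessed_items
  attempt += 1
  sleep([0.1 * 2**attempt, 5.0].min) # capped exponential backoff
end
```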
Fantastic, thanks for that. (I'll take a look at my use of the param conversion a bit later this week, so thanks for the pointer there too.)
DynamoDB's `batch_write_item` API can fail and return `unprocessed_items`. The docs say that the format of this returned data can be used in a subsequent `batch_write_item`, but that's not quite true. You can't simply do something like

```ruby
client.batch_write_item(request_items: response.unprocessed_items)
```

because the response is not a hash, as the params expect. What I've ended up doing is passing the unprocessed items through a helper method (shown below) that munges the response structure back into hash form; `deep_transform_keys` comes from ActiveSupport. (This whitelist approach is required because the table-name key of the hash supplied to `batch_write_item` needs to stay a string while the structural keys become symbols. There may be other cases that I've not encountered yet.)
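This is the helper as it appears (commented out) in the code pasted earlier in the thread:

```ruby
def self.unprocessed_items_to_request_items(unprocessed_items)
  unprocessed_items.as_json.deep_transform_keys do |key|
    if %w{ item put_request n s b ss ns bs m l null bool delete_request }.include? key
      key.to_sym
    else
      key
    end
  end
end
```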
It would be quite good not to require dirty hacks like this, if possible. :-)
(This is SDK v2, by the way.)