Some other problems that need consideration when designing a new bulk interface are:
Currently we work with streaming responses. Unfortunately, errors that occur in the queries cannot be propagated properly, since the response header (and with it the status code) has already been written to the wire. Our current solution with an error field is awkward for users to handle.
Many bulk API calls are heavy and therefore susceptible to query timeouts; this needs to be handled somehow.
Many bulk API calls produce a lot of load on the DB, since at the moment all requests are fired at the DB asynchronously without any limit. This can overwhelm even a large Cassandra instance.
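One possible way to address both the load and the timeout problem is to cap the number of in-flight queries and give each query its own deadline. The following is only a minimal sketch using `asyncio`; `run_bounded`, `query`, `max_concurrency` and `timeout_s` are hypothetical names for illustration and are not part of the current code base.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List, Optional, Tuple


async def run_bounded(
    keys: Iterable[str],
    query: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 16,
    timeout_s: float = 5.0,
) -> List[Tuple[str, Any, Optional[str]]]:
    """Run `query` for every key with at most `max_concurrency` in flight.

    Returns (key, result, error) triples in input order. A failed or
    timed-out query yields (key, None, "<error>") instead of aborting
    the whole batch.
    """
    # Hypothetical limit; a real value would depend on the DB cluster.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(key: str) -> Tuple[str, Any, Optional[str]]:
        async with semaphore:
            try:
                # The deadline only covers the query itself, not the time
                # spent waiting for a free slot.
                result = await asyncio.wait_for(query(key), timeout=timeout_s)
                return key, result, None
            except asyncio.TimeoutError:
                return key, None, "timeout"
            except Exception as exc:
                return key, None, repr(exc)

    return list(await asyncio.gather(*(run_one(k) for k in keys)))
```

Because every item carries its own error slot, a single slow or failing query does not have to fail the entire bulk request.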
The bulk API is more or less schema-less. It receives the values to return as Python dicts and infers the structure from them, automatically flattening lists and complex data types. This has caused issues with records that have optional fields: the header of the resulting CSV is inferred from the first result written, which leads to nondeterministic, timing-dependent results and errors.
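A possible fix for the header problem is to declare the columns up front instead of inferring them from the first record. The sketch below uses the standard `csv.DictWriter`; the `flatten` and `write_csv` helpers, the dotted column names and the `;` list separator are assumptions made for illustration and do not reflect how the current implementation flattens values.

```python
import csv
import io
from typing import Any, Dict, Iterable, List


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys; lists are joined into one cell."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, (list, tuple)):
            # Assumed list encoding, chosen only for this example.
            flat[name] = ";".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def write_csv(records: Iterable[Dict[str, Any]], columns: List[str]) -> str:
    """Write records using a fixed, declared header.

    Missing optional fields become empty cells; fields that are not
    declared in `columns` raise ValueError instead of silently changing
    the layout.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",
        extrasaction="raise",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()
```

With a declared column list, the header no longer depends on which result happens to arrive first.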
Currently only the first `num_pages` pages can be fetched per bulked request. Add page handle that encodes page state per bulked request.
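As a rough idea of what such a page handle could look like, the sketch below packs the per-request paging state into one opaque, URL-safe token. It assumes that each bulked request is identified by a string key and that the backend's next-page state can be represented as a string (for example hex-encoded), with `None` meaning the request is exhausted. `encode_page_handle` and `decode_page_handle` are hypothetical names.

```python
import base64
import json
from typing import Dict, Optional


def encode_page_handle(page_states: Dict[str, Optional[str]]) -> str:
    """Pack per-request page states into one opaque, URL-safe token."""
    payload = json.dumps(page_states, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_page_handle(handle: str) -> Dict[str, Optional[str]]:
    """Recover the per-request page states from a token."""
    payload = base64.urlsafe_b64decode(handle.encode("ascii"))
    return json.loads(payload.decode("utf-8"))
```

A client would send the handle back with its next call, and the server would resume every bulked request whose state is not `None`. The token is not signed in this sketch, so a real implementation would likely need to validate or sign it.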