-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multi get api to the high level rest client #27337
Conversation
Can we contribute to this one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot @martijnvg sorry it took me ages to review this. I left some comments and also asked @cbuescher to have a look too.
@@ -283,6 +285,25 @@ public void getAsync(GetRequest getRequest, ActionListener<GetResponse> listener | |||
performRequestAsyncAndParseEntity(getRequest, Request::get, GetResponse::fromXContent, listener, singleton(404), headers); | |||
} | |||
|
|||
/** | |||
* Retrieves multi documents by id using the Multi Get API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/multi/multiple ?
} | ||
|
||
/** | ||
* Asynchronously retrieves multi documents by id using the Multi Get API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/multi/multiple ?
} | ||
} | ||
|
||
int numberOfRequests = randomIntBetween(0, 32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when we add no items?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test it is fine, but on the server side we do fail if no items have been specified (see MultiGetRequest#validate(...)
).
} else { | ||
fetchSourceContext = new FetchSourceContext(false); | ||
} | ||
item.fetchSourceContext(fetchSourceContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you can use randomizeFetchSourceContextParams
?
break; | ||
case START_ARRAY: | ||
if (Fields.DOCS.equals(currentFieldName) == false) { | ||
throw new ElasticsearchParseException("Unexpected field [" + currentFieldName + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that throwing exceptions here is a good idea. We should rather be lenient to ensure forward compatibility. Say we add another array at the same level of docs (not likely, but still...) we should not throw error if your own client receives such a response. Rather ignore it.
items.add(new MultiGetItemResponse(getResponse, null)); | ||
} | ||
} else { | ||
throw new ElasticsearchParseException("Unexpected token [" + innerToken + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too, throwing is a problem.
// then creating a parser from that is a bit inefficient, but makes the parsing code less | ||
// complex. I think this is the right trade off. Unless we introduce a new xcontent format | ||
// in mget response for the hl rest client that is enabled via a parameter that renders | ||
// an unambiguous format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me a bit sad to have to go with maps here, yet I see why it is hard to do it differently.
@cbuescher any idea on making this work without maps? Here is an example response for clarify:
{
"docs" : [
{
"_index" : "my_index",
"_type" : "my_type",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"user" : [
{
"field1" : 1,
"field2" : 2
},
{
"field1" : 3,
"field2" : 4
}
]
}
},
{
"_index" : "test",
"_type" : "type",
"_id" : "2",
"error" : {
"root_cause" : [
{
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "test",
"index_uuid" : "_na_",
"index" : "test"
}
],
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "test",
"index_uuid" : "_na_",
"index" : "test"
}
}
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand after a first glace the difference here is that the objects in docs
can eithe be failures or sth. that can already be parsed by GetResult#fromXContent(). The only thing the failure items have is an error
field. Ideally we could reuse the GetResult parser so it optionally accepts an error
field. Unfortunaltely the current GetResult parser isn't easily extendable, but could we add a flag to it that allows optionally parsing the error
field and have GetResult contain an optional Failure?
Another option would be to rewrite GetResult#fromXContent to use Object parser, then we could have a common "declareFields(Parser ...)" method that sets up the parser with the parts that GetResult needs and reuse that setup method plus a parser for the "error" field for a specialized MultiGetResult or something.
I think that way we could avoid parsing to a generic map here, which I would also appreciate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna @cbuescher I'll look into changing the GetResult#fromXContent(...)
method, so that we don't have to parse into a generic map here.
case EXTERNAL_GTE: | ||
return "external_gte"; | ||
case FORCE: | ||
return "force"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just do versionType.toLowercase(Locale.ROOT)
?
MultiGetResponse expected = createTestInstance(); | ||
XContentType xContentType = randomFrom(XContentType.values()); | ||
BytesReference shuffled = toShuffledXContent(expected, xContentType, ToXContent.EMPTY_PARAMS, false); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that inserting random fields here would reveal problems on the parsing side with the current code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I tried inserting random fields, but then the tests fails, because in GetResult#fromXContentEmbedded(...)
in line 294 adds any unknow json field as field to retrieve, this causes the expected response item to be not equal with the actial response item.
} else { | ||
throw new IllegalArgumentException("unsupported token [" + token + "]"); | ||
} | ||
return xContentBuilder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ops, I wonder why we are finding this only now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because otherwise FetchSourceContextTests
test fails. I guess we never shuffled xcontent that had only a top level value (which is valid).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still wonder why this would be a problem. At least as far as I understand this method should only take whole xContent Objects or Arrays, and they should all parse to a map fine (even with just one field). Or are you talking about things like { "foo" }
. Is that valid at all? I would like to take a look at the failing test before adding this here, I think this method should rather check that the start token is either Array- or Object-Start and otherwise reject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are you talking about things like { "foo" }. Is that valid at all?
Yes, that is valid.
I noticed that we didn't have a isolated unit test for FetchSourceContext
. This class serializes a bool value directly without adding an enclosing json object. That is because fetch source context is part of a search request under a json field. This causes FetchSourceContextTests
to fail, because shuffleXContent(...)
method returned an empty XContentBuilder
. This made me change the shuffleXContent(...)
method.
I think I can add an assert here to check whether a value is top level and otherwise fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at FetchSourceContext and I think its toXContent() method is kind of problematic in a sense that it sometimes renders a complete object (with start/end Object) and sometimes only a value. Given that FetchSourceContext extends ToXContentObject, I think the later is a case that shouldn't happen. Should we move this out of this PR or is FetchSourceContextTests needed? I don't think I would like to special-case the shuffle methods for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this test is not needed for adding mget api to high level rest client. So 👍 to move this test out and its related changes out of this pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martijnvg @javanna I took a first look regarding the parsing to map and left a first idea, I will also take a look at the rest of the PR later today.
17484eb
to
a21b3ec
Compare
@javanna @cbuescher I've changed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martijnvg thanks, I did another round of reviews, I still need to take a short deeper look at the tests and will add those comments later. But the rest looks good already to me minus a few small things.
@@ -123,6 +129,11 @@ public String id() { | |||
return this.id; | |||
} | |||
|
|||
public Item id(String id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this setter needed? I experimentally removed it and everything seems fine, in which case I would like to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this can be removed. This was a left over from the parser.
@@ -371,9 +403,9 @@ public MultiGetRequest add(@Nullable String defaultIndex, @Nullable String defau | |||
|
|||
public static void parseDocuments(XContentParser parser, List<Item> items, @Nullable String defaultIndex, @Nullable String defaultType, @Nullable String[] defaultFields, @Nullable FetchSourceContext defaultFetchSource, @Nullable String defaultRouting, boolean allowExplicitIndex) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be private? Seems only be used from "add" and the other parseDocuments() method.
@@ -493,8 +525,8 @@ public static void parseDocuments(XContentParser parser, List<Item> items) throw | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we need public static void parseDocuments(XContentParser parser, List<Item> items)
in the code? I think we can remove this and only leave "add(...)" as an entry point to parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The add(...)
method is pretty large already, I like not to make it larger than it already is by adding the that is now in parseDocuments(...)
to it.
|
||
public class MultiGetResponse extends ActionResponse implements Iterable<MultiGetItemResponse>, ToXContentObject { | ||
|
||
private static final ParseField INDEX = new ParseField(Fields._INDEX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we remove the Fields constants and use ParseField.getPreferredName() where we currently use the Strings (like in toXContent())
String id = null; | ||
ElasticsearchException exception = null; | ||
GetResult getResult = null; | ||
item: for (token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, haven't seen breaks with targets in a while. This might be a bit too sneaky, I think I would like pulling out the item parser for better readability.
String currentFieldName = null; | ||
List<MultiGetItemResponse> items = new ArrayList<>(); | ||
for (Token token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { | ||
switch (token) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this switch have a default case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is no default case. Normally I would add a default case that throws an exception, but we should be lenient here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, can you still add an empty default case, maybe with a comment that we ignore those cases? I get a warning here that could be avoided.
ElasticsearchException exception = null; | ||
GetResult getResult = null; | ||
item: for (token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { | ||
switch (token) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, maybe have a default case? Not sure if it needs to do anything (I think parsing should be lenient for the response parsing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is no default case. Normally I would add a default case that throws an exception, but we should be lenient here.
break; | ||
case START_OBJECT: | ||
if (ERROR.match(currentFieldName)) { | ||
exception = ElasticsearchException.fromXContent(parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it would be possible/worthwhile to augment GetResult (and its parsing method) with an optional exception field and then only use GetResult.fromXContentEmbedded() for both cases. That would add something to GetResult that is unused in the normal Get-API, but it would simplify things here. I'm on the fence about this idea, maybe @javanna can comment on this too.
2ae4a63
to
67bebfa
Compare
@cbuescher Thanks for the review! I've updated the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martijnvg Hi, thanks for the updates, I left some super minor nits but other than that looks good to me. I don't know if @javanna wants to give this another look though.
String id = null; | ||
ElasticsearchException exception = null; | ||
GetResult getResult = null; | ||
item: for (Token token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no need for the loop label anymore
String currentFieldName = null; | ||
List<MultiGetItemResponse> items = new ArrayList<>(); | ||
for (Token token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { | ||
switch (token) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, can you still add an empty default case, maybe with a comment that we ignore those cases? I get a warning here that could be avoided.
ElasticsearchException exception = null; | ||
GetResult getResult = null; | ||
item: for (Token token = parser.nextToken(); token != Token.END_OBJECT; token = parser.nextToken()) { | ||
switch (token) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe empty default wt. comment here as well.
no need to wait for my final LGTM here, I trust you @cbuescher @martijnvg ! please merge it in, although it has conflicts now unfortunately. |
ec3a2ee
to
2b8c1d6
Compare
2b8c1d6
to
853f7e8
Compare
Is this in a released version? if yes, can you please point me to the documentation? |
@shabtaisharon according to the labels on this issue this has been merged to 6.2. However, it seems the documentation was only added in 6.x so far: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/6.x/java-rest-high-document-multi-get.html, but I think that should reflect what is there in 6.2 already. |
@cbuescher thank you! |
Relates to #27205