-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HLRC API for _termvectors #33447
HLRC API for _termvectors #33447
Conversation
Pinging @elastic/es-core-infra |
relates to elastic#27205
76c90c5
to
2272642
Compare
if (docBuilder != null) { | ||
BytesReference doc = BytesReference.bytes(docBuilder); | ||
try (InputStream stream = doc.streamInput()) { | ||
builder.rawField("doc", stream, docBuilder.contentType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like the name of this method! It makes me think we're copying the field raw without checking the content type but we are checking it.
private final long docVersion; | ||
private final boolean found; | ||
private final long tookInMillis; | ||
private List<TermVector> termVectorList = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a little weird to have just this part be mutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible for the ConstructingObjectParser
to pass in a NamedObjects
constructorArg
? Right now on line 67 is where its being mutated.
client/rest-high-level/src/main/java/org/elasticsearch/client/core/TermVectorsResponse.java
Show resolved
Hide resolved
public Term() {}; | ||
|
||
public static Term fromXContent(XContentParser parser, String term) { | ||
Term t = TERMPARSER.apply(parser, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put TERMPARSER
into this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same for the other parsers, I think.
private float score = -1f; | ||
private List<Token> tokens = null; | ||
|
||
public Term() {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this one mutable when the others are immutable? Maybe make all immutable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work thus far. Had some comments / nits to address. It looks mostly done tho.
@@ -1122,6 +1123,26 @@ static Request analyze(AnalyzeRequest request) throws IOException { | |||
return req; | |||
} | |||
|
|||
static Request termVectors(TermVectorsRequest tvrequest) throws IOException { | |||
String endpoint = "_termvectors"; | |||
if (tvrequest.getId() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the EndpointBuilder() gladly accepts nulls in its addPathPart(...)
method. You dont need to worry about passing a null value in, as it just wont put it in the path.
} | ||
Request request = new Request(HttpGet.METHOD_NAME, endpoint); | ||
Params params = new Params(request); | ||
if (tvrequest.getRouting() != null) params.withRouting(tvrequest.getRouting()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the params methods all call putParam
and that does a null check, and will not add it if its null. So no need to check here.
|
||
if (filterSettings != null) { | ||
builder.startObject("filter"); | ||
if (filterSettings.containsKey("max_num_terms")) builder.field("max_num_terms", filterSettings.get("max_num_terms")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a helper method here to reduce the copy/paste
private final long docVersion; | ||
private final boolean found; | ||
private final long tookInMillis; | ||
private List<TermVector> termVectorList = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible for the ConstructingObjectParser
to pass in a NamedObjects
constructorArg
? Right now on line 67 is where its being mutated.
} | ||
|
||
@Override | ||
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we need this
import java.util.function.Predicate; | ||
|
||
|
||
public class TermVectorsResponseTests extends AbstractXContentTestCase<TermVectorsResponse> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when removing the ToXContent, you will have to manually create a parser to test. This could be nasty on such a large class hierarchy. Im sorry, but its the best we have now. We need to come up with a better way. You should be able to move the code out of the response and just use it in a test.
@@ -61,7 +61,7 @@ public DeleteByQueryRequest(String... indices) { | |||
this(new SearchRequest(indices)); | |||
} | |||
|
|||
DeleteByQueryRequest(SearchRequest search) { | |||
public DeleteByQueryRequest(SearchRequest search) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended? I dont see any uses of it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably drop this, yeah.
- use ConstructorObjectParser to make object fields final
…rm-vectors-new-response-format
- move toXContent section from TermVectorsResponse to TermVectorsResponseTests - refactors docs to use concise version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a bunch of minor stuff but I think it is pretty much done!
this.id = docId; | ||
} | ||
|
||
public String getIndex() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you put these below the other ctor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe add javadoc for them like you did with the other methods?
String[] filterSettingNames = | ||
{"max_num_terms", "min_term_freq", "max_term_freq", "min_doc_freq", "max_doc_freq", "min_word_length", "max_word_length"}; | ||
for (String settingName : filterSettingNames) { | ||
if (filterSettings.containsKey(settingName)) builder.field(settingName, filterSettings.get(settingName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be a class rather than a free form map? It looks like this'd silently not serialize some stuff and that seems bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these parameters are optional, and a request doesn't require any of them, that is why I think it is fine to have it as a map.
private final List<TermVector> termVectorList; | ||
|
||
public TermVectorsResponse( | ||
String index, String type, String id, long version, boolean found, long tookInMillis, List<TermVector> termVectorList) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you indent this one more time so the parameters don't line up with the body?
this.found = found; | ||
this.tookInMillis = tookInMillis; | ||
if (termVectorList != null) { | ||
Collections.sort(termVectorList, Comparator.comparing(TermVector::getFieldName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably sort in the parser rather than the ctor. It just feels a little less surprising.
|
||
@SuppressWarnings("unchecked") | ||
private static ConstructingObjectParser<TermVectorsResponse, Void> PARSER = new ConstructingObjectParser<>( | ||
"term_vectors", true, args -> new TermVectorsResponse((String) args[0], (String) args[1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably make this a block with a return
statement because that'd make it a bit easier to read. I think.
return totalTermFreq; | ||
} | ||
|
||
public Float getScore(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moar javadoc!
} | ||
|
||
private void toXContent(TermVectorsResponse response, XContentBuilder builder) throws IOException { | ||
builder.startObject(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
funny indentation
@@ -1503,6 +1507,120 @@ public void afterBulk(long executionId, BulkRequest request, Throwable failure) | |||
} | |||
} | |||
|
|||
public void testTermVectors() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think getting the term vectors is part of CRUD. I don't know what it is, but I don't think it is CRUD. I guess if there isn't any better place, here is good. But maybe leave a comment?
int position = token.getPosition(); // <18> | ||
int startOffset = token.getStartOffset(); // <19> | ||
int endOffset = token.getEndOffset(); // <20> | ||
String payload = token.getPayload(); // <21> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow that is a lot of callouts! Maybe it'd be easier to read the docs if you split it? I'm not sure how that'd look, but 21 callouts is going to be super hard to read!
@@ -61,7 +61,7 @@ public DeleteByQueryRequest(String... indices) { | |||
this(new SearchRequest(indices)); | |||
} | |||
|
|||
DeleteByQueryRequest(SearchRequest search) { | |||
public DeleteByQueryRequest(SearchRequest search) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably drop this, yeah.
…rm-vectors-new-response-format
@nik9000 Thanks for such a detailed feedback, Nik. I have addressed all your comments, can you please continue the review. The only comment I have not addressed is to make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one super minor nit. also, wow you picked a giant client call!! Thank you for taking the time to put the new classes into the core folder in client instead of using the existing server ones. Itll be one less thing to do to decouple. <3 <3
* @param docId - id of the document | ||
*/ | ||
public TermVectorsRequest(String index, String type, String docId) { | ||
this.index = index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this(index, type);
this.id = docId;
Sort lists for equality in parsers
…rm-vectors-new-response-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
relates to elastic#33447
* HLRC API for _termvectors relates to #27205
relates to #27205