Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove JsonTermWriter #2238

Merged
merged 2 commits into from
Apr 18, 2024
Merged

remove JsonTermWriter #2238

merged 2 commits into from
Apr 18, 2024

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Nov 3, 2023

No description provided.

@PSeitz PSeitz requested a review from fulmicoton November 7, 2023 14:55
src/core/json_utils.rs Outdated Show resolved Hide resolved
json_term_writer,
DateTime::from_utc(dt_utc),
));
term.append_type_and_fast_value(DateTime::from_utc(dt_utc));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is a bug or a bug in waiting.

For this to work, we need term to not contain any value at this point.

@PSeitz PSeitz force-pushed the path_collector branch 4 times, most recently from 2afa480 to ce04b98 Compare November 8, 2023 04:34
phrase: &str,
) -> Option<Term> {
/// Tries to infer a JSON type from a string and append it to the term.
pub(crate) fn convert_to_fast_value_and_append(term: &Term, phrase: &str) -> Option<Term> {
Copy link
Collaborator

@fulmicoton fulmicoton Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the prototype and/or the impl is a bit silly / misleading.
We pass a reference but always start by cloning it... even though we sometime return None in the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cloned it, because it's not performance critical and simplifies the code

src/schema/term.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR contains several pitfalls. It cannot be merged as is.

src/core/json_utils.rs Outdated Show resolved Hide resolved
/// Append a type marker + fast value to a term.
/// This is used in JSON type to append a fast value after the path.
///
/// It will not clear existing bytes.
pub(crate) fn append_type_and_fast_value<T: FastValue>(&mut self, val: T) {
self.0.push(T::to_type().to_code());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have debug assert here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In indexing we don't have a path, but a path id, so the debug_assert doesn't work here. It's also atypical for the Term methods, they are just building blocks, and the caller has to know the Term state

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment made me think it was only for json. So it is not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but there are two JSON currently
Indexing: [FieldId][PathId][Value]
Query: [FieldId][JsonPath][JsonPathEnd][Value]

When we switch to [PathId][Value] for all data during Indexing, we can split Term into Term and IndexingTerm with a clearer API.

@PSeitz PSeitz force-pushed the path_collector branch 3 times, most recently from 982bc22 to b5d07c1 Compare November 15, 2023 02:07
@PSeitz PSeitz requested a review from fulmicoton November 15, 2023 05:47
remove JsonTermWriter
remove path truncation logic, add assertion
@PSeitz PSeitz merged commit 0e9fced into main Apr 18, 2024
4 checks passed
@PSeitz PSeitz deleted the path_collector branch April 18, 2024 14:28
philippemnoel pushed a commit to paradedb/tantivy that referenced this pull request Aug 31, 2024
* remove JsonTermWriter

remove JsonTermWriter
remove path truncation logic, add assertion

* fix json_path_writer add sep logic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants