Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(search): STOPWORDS #3851

Merged
merged 4 commits into from
Oct 10, 2024
Merged

Conversation

dranikpg
Copy link
Contributor

@dranikpg dranikpg commented Oct 2, 2024

Support for FT.STOPWORDS

@dranikpg dranikpg force-pushed the search-stopword-1 branch 2 times, most recently from b5a7db7 to ada95d0 Compare October 3, 2024 11:56
Signed-off-by: Vladislav Oleshko <[email protected]>
@dranikpg dranikpg changed the title feat(search/core): stopwords feat(search): STOPWORDS Oct 3, 2024
Comment on lines 97 to 98
const Schema* schema_;
const IndicesOptions* options_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same shared_ptr vs ptr question here. Everything points esentially to shared_ptr that is created in the coordinator when parsing the index definition. Lifetimes are ensured by ShardDocIndex that create the FieldIndices

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you feel that shared_ptr is more correct because of lack of single owner, then lets switch to shared_ptr

Comment on lines -185 to +187
bool was_built_ = false;
std::shared_ptr<const DocIndex> base_;
search::FieldIndices indices_;
std::optional<search::FieldIndices> indices_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More elegant, creating a dummy instance with nullptrs was just an ugly hack

Comment on lines +98 to +99
private:
const StopWords* stopwords_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Not sure if to make it a shared_ptr (can be easily created from IndexOptions from an aliasing constructor)
  2. Const-ptr to indicate longevity... or use a reference instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer a const ref.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you ask then? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I forgot to make this a constref 😨 Never mind

Comment on lines 458 to 465
IndicesOptions::IndicesOptions() {
static absl::flat_hash_set<std::string> kDefaultStopwords{
"a", "is", "the", "an", "and", "are", "as", "at", "be", "but", "by",
"for", "if", "in", "into", "it", "no", "not", "of", "on", "or", "such",
"that", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"};

stopwords_ = kDefaultStopwords;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this should be part of the core module or the server module 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine here.

@dranikpg dranikpg marked this pull request as ready for review October 5, 2024 07:14
@dranikpg
Copy link
Contributor Author

dranikpg commented Oct 5, 2024

Remebered: Need to add and test persistence 🤓

@romange romange self-requested a review October 6, 2024 05:36
stopwords_ = kDefaultStopwords;
}

FieldIndices::FieldIndices(const Schema* schema, const IndicesOptions* options,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the ownership stays with the caller while this object lives, it's better using const refs

@dranikpg dranikpg requested a review from romange October 9, 2024 17:19
@dranikpg dranikpg merged commit e71f083 into dragonflydb:main Oct 10, 2024
9 checks passed
@dranikpg dranikpg deleted the search-stopword-1 branch October 20, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants