Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add GEOSEARCH support #2070

Merged
merged 6 commits into from
Oct 31, 2023
Merged

feat: Add GEOSEARCH support #2070

merged 6 commits into from
Oct 31, 2023

Conversation

azuredream
Copy link
Contributor

@romange romange requested a review from chakaz October 25, 2023 06:02
Copy link
Collaborator

@chakaz chakaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for your contribution!
Please see some comments below, also feel free to point out anything I might have missed

src/redis/geo.c Show resolved Hide resolved
@@ -48,6 +54,20 @@ using MScoreResponse = std::vector<std::optional<double>>;
using ScoredMember = std::pair<std::string, double>;
using ScoredArray = std::vector<ScoredMember>;

typedef struct GeoPoint {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
typedef struct GeoPoint {
struct GeoPoint {

This is c-syntax, we're in a C++ file :)

const std::string& _member)
: longitude(_longitude), latitude(_latitude), dist(_dist), score(_score), member(_member){};

} GeoPoint;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} GeoPoint;
};

double dist;
double score;
std::string member;
GeoPoint() = default;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not zero the double members of the struct.
I suggest removing both constructors, and add default initializers (e.g. double logitude = 0.0;).
As for the c'tor accepting arguments, one could initialize the struct with GeoPoint{a, b, c, ...} or even better GeoPoint{.logitude=a, .latitude=b, ...}

@@ -2599,6 +2636,296 @@ void ZSetFamily::GeoDist(CmdArgList args, ConnectionContext* cntx) {
distance_multiplier);
}

// Search all eight neighbors + self geohash box
int membersOfAllNeighbors(ConnectionContext* cntx, string_view key, const GeoHashRadius* n,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please:

  1. Capitalize the first character (MembersOfAllNeighbors)
  2. Add anonymous namespace wrapper around this function for internal linkage (i.e. add namespace { before and } // namespace after

bool withhash = false;

// parse arguments
size_t i = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move i to be defined as part of the loop?

src/server/zset_family.cc Show resolved Hide resolved
}
} else if (cur_arg == "COUNT") {
if (i + 1 < args.size()) {
// TODO: need a string_view impl of stoull
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absl::SimpleAtoi()

withhash = true;
else {
return (*cntx)->SendError(kSyntaxErr);
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for break after return :)

Comment on lines 2881 to 2882
auto asc_comparator = [](const GeoPoint& a, const GeoPoint& b) { return a.dist < b.dist; };
auto desc_comparator = [](const GeoPoint& a, const GeoPoint& b) { return a.dist > b.dist; };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reduce the scope of asc_comparator and desc_comparator to the smallest possible?

@azuredream
Copy link
Contributor Author

Hi @chakaz. Thanks a lot for your review. I haven't clean it up when I submitted it as draft. Your insights saved me a lot of time.

I'll accept most of your comments but have question about two: using neighbor vector, and const ref shape.
I'll have to rewrite(copy from geo.c and change the input type) almost all the GEOSEARCH related functions. I want to confirm with you before I proceed.

@chakaz
Copy link
Collaborator

chakaz commented Oct 26, 2023

Hi @chakaz. Thanks a lot for your review. I haven't clean it up when I submitted it as draft. Your insights saved me a lot of time.

I'll accept most of your comments but have question about two: using neighbor vector, and const ref shape. I'll have to rewrite(copy from geo.c and change the input type) almost all the GEOSEARCH related functions. I want to confirm with you before I proceed.

Gladly, thank you for your contribution!
But I'm not sure what your questions are?
If you accept a T& t and need to pass it to a function accepting a pointer, you could call if via foo(&t), but I'm not sure if this is what you're asking?

@romange
Copy link
Collaborator

romange commented Oct 26, 2023

If accepting constref shape requires copying functions from geo.c that otherwise we could use from there then you should use mutable pointer - I prefer avoid copying. Another option is cheating and use const_cast<...> to remove constness inside your function before calling redis functions. it's also fine.

@@ -48,6 +48,8 @@ class ZSetFamily {
struct ZRangeSpec {
std::variant<IndexInterval, ScoreInterval, LexInterval, TopNScored> interval;
RangeParams params;
ZRangeSpec() = default;
ZRangeSpec(const ScoreInterval& si, const RangeParams& rp) : interval(si), params(rp){};
Copy link
Contributor Author

@azuredream azuredream Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required by emplace_back().

@azuredream
Copy link
Contributor Author

azuredream commented Oct 26, 2023

Thanks for your advice. const_cast sounds good.
I've updated my PR. I'd appreciate it if you could review the updated PR when you have some time.

To-do list for future developments:

  1. Remove the intermediate result result_arrays.
  2. Refine the distance precision to 4 decimal places
  3. Improve the precision for longitude and latitude values.
    Redis:
    1. "Berlin"
    2. "0.0002"
      1. "13.40500205755233765"
      2. "52.51999907056681138"
        DFDB:
    1. "Berlin"
    2. "0.00017343178521311378"
      1. "13.405002057552338"
      2. "52.51999907056681"

Once this PR is merged, implementing #1548 and #1878 should be straightforward. Please feel free to assign them to me if you need.

@azuredream azuredream marked this pull request as ready for review October 27, 2023 13:50
double longitude;
double latitude;
double dist;
double score;
std::string member;
GeoPoint() = default;
GeoPoint() : longitude(0.0), latitude(0.0), dist(0.0), score(0.0), member(""){};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits:

  1. It's slightly better to use initializers when defining the members (like double logitude = 0.0;) instead of here. The reason is that future c'tors will work with 0s by default, unless modified
  2. No need to initialize member as it's a class, its c'tor will be called

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed member(""). Set it just for consistency.
A default constructor is required by ga.resize(), I can't remove GeoPoint():
../src/server/zset_family.cc:2739:21: required from here
/usr/include/c++/9/bits/stl_construct.h:75:7: error: no matching function for call to ‘dfly::{anonymous}::GeoPoint::GeoPoint()’

src/redis/geo.c Show resolved Hide resolved
// Search all eight neighbors + self geohash box
int MembersOfAllNeighbors(ConnectionContext* cntx, string_view key, const GeoHashRadius& n,
const GeoShape& shape_ref, GeoArray* ga, unsigned long limit) {
vector<GeoHashBits> neighbors(9);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you know the size in advance, you could use std::array<GeoHashBits, 9> instead of a vector. It would save a dynamic memory allocation

GeoShape* shape = &(const_cast<GeoShape&>(shape_ref));
GeoHashRadius georadius = geohashCalculateAreasByShapeWGS84(shape);
GeoArray ga;
MembersOfAllNeighbors(cntx, key, georadius, shape_ref, &ga, any ? count : 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will reply to the client in case of error, so you have to check its return value here. We can't reply twice.
Further, it seems like you only case about the error case of the function, so no need to calculate and return count there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified MembersOfAllNeighbors to return a bool value. It would return false if already replied inside.


// if no matching results, the user gets an empty reply.
if (ga.empty()) {
(*cntx)->SendNull();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return after replying

Comment on lines 2729 to 2737
double conversion = geo_ops.conversion;
uint64_t count = geo_ops.count;
bool asc = geo_ops.asc;
bool desc = geo_ops.desc;
bool any = geo_ops.any;
bool withdist = geo_ops.withdist;
bool withcoord = geo_ops.withcoord;
bool withhash = geo_ops.withhash;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove all of these variables, they don't really help but make this function more bloat. You could just use geo_ops.* instead.
For example, conversion is only used once (with a fairly short line)

Comment on lines 73 to 74
bool asc = 0;
bool desc = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an enum like:

enum class Sorting { kUnsorted, kAsc, kDesc };

And replace these members with Sorting sorting;?

This would be easier to read than a bunch of booleans, and also will only represent valid cases (with this code - both could theoretically be true, which doesn't make sense)

Comment on lines 2752 to 2757
if (count > 0) {
std::partial_sort(ga.begin(), ga.begin() + count, ga.end(), asc_comparator);
ga.resize(count);
} else {
std::sort(ga.begin(), ga.end(), asc_comparator);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create a function (in the anonymous namespace) with this, which accepts a comparator (probably of type absl::FunctionRef<bool(const GeoPoint&, const GeoPoint&)>)?
This way you both shorten this function, and remove code copy-paste

An alternative would be to create a function which will take the above-mentioned enum and sort internally. Something like:

void SortIfNeeded(GeoArray* ga, Sorting sorting) {
  if (sorting == kUnsorted)
    return;

  auto comparator = [&](const GeoPoint& a, const GeoPoint& b) {
    if (sorting == Sorting::kAsc) {
      return a.dist < b.dist;
    } else {
      DCHECK(sorting == Sorting::kDesc);
      return a.dist > b.dist;
    }
  };

  if (count > 0) {
    std::partial_sort(ga.begin(), ga.begin() + count, ga.end(), comparator);
    ga.resize(count);
  } else {
    std::sort(ga.begin(), ga.end(), comparator);
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created SortIfNeeded as you suggested:)

src/server/zset_family.cc Show resolved Hide resolved
@azuredream
Copy link
Contributor Author

PR updated. Thanks again for your review and suggestions.

@romange romange requested a review from chakaz October 31, 2023 07:22
Copy link
Collaborator

@chakaz chakaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo, thank you again for your contribution!
(and for withstanding my comments :)

@chakaz chakaz merged commit 05919ef into dragonflydb:main Oct 31, 2023
7 checks passed
@chakaz chakaz mentioned this pull request Oct 31, 2023
@romange
Copy link
Collaborator

romange commented Oct 31, 2023

thank you @azuredream !

@piyongcai-liucai
Copy link

FULL SUPPORT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants