-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Queryable Sub-Trees #22
Comments
Hey @vassudanagunta, Yeah the API could certainly be more flexible in general - it's really the minimum needed for specific use case (go-memdb). That said, a prefix based lookup can in most cases give you essentially the same thing - it doesn't expose a subtree's actual node directly as it's wrapped in the iterator but generally allows you to walk any subset of the tree from a given node. Given that the only thing you can do with a Node currently is further prefix/exact match based lookups it's kinda of equivalent but if you have a use case where exposing the actual Node API of a subtree is convenient then it's probably not hard to add. The Iterator itself has a reference to the node that was sought with any call to SeekPrefix so you could just make it return that node explicitly. I've not thought through whether this is a good idea or not in detail but off the to of my head it seems reasonable and likely a smallish change. |
Thanks for the detail reply, @banks. My use-case is fast prefix-based lookups of file/URL paths, as well as glob matching, which I can implement by seeking to glob's wildcard-free prefix, then walking the tree from there to find matches on the wildcard-containing suffix. I will often have to do repeated lookups (not walks) relative to a specific node (e.g. a directory). You might call these subprefix lookups. Having to do it from the root every time means several points of redundancy: (1) having to concatenate the prefix (the directory/base path) and the subprefix (the relative path), (2) all the redundant string-[]byte conversions and then (3) go-immutable-radix having to rewalk the tree down to the prefix, even though it already did that for me many times just a microsecond ago. Being a Go newb and only an open source programmer on the side at that, I'm not sure which makes most sense:
|
I think that use case sounds legit!
It would almost be easy - just adding a `Node()` method to the iterator in
iter.go that returns `i.node` would kinda work for some of your examples.
The awkward part is that because the iterator is only designed to iterate
the leaves and not the internal nodes, it would depend on the struture of
your keys and which nodes have been collapsed by prefix compression as to
exactly what worked which is a bit ugly to expose to the user to reason
about.
It might be possible to use rawIterator (raw_iter.go) to expose the nodes
in a nicer way.
Overall though whether it's worth trying to build a nice-enough API that is
also more efficient that just doing the lookup every time is a hard call -
I'd probably not do that until I had real benchmark data to show it was
important unless you want to do it just for the learning experience!
…On Thu, May 23, 2019 at 7:07 PM Vas Sudanagunta ***@***.***> wrote:
Thanks for the detail reply, @banks <https://github.com/banks>.
My use-case is fast prefix-based lookups of file/URL paths, as well as
glob matching, which I can implement by seeking to glob's wildcard-free
prefix, then walking the tree from there to find matches on the
wildcard-containing suffix.
I will often have to do repeated lookups (not walks) relative to a
specific node (e.g. a directory). You might call these *subprefix*
lookups. Having to do it from the root every time means two points of
redundancy: (1) having to concatenate the prefix (the directory/base path)
and the subprefix (the relative path) and then (2) go-immutable-radix
having to rewalk the tree down to the prefix, even though it already did
that for me many times just a microsecond ago.
Being a Go newb and only a programmer on the side at that, I'm not sure
which makes most sense:
1. The cost is relatively insignificant; just pay it!
2. Update go-immutable-radix to provide a way of getting a public
handle on an arbitrary Node. (I could do that, with tests, and submit a PR)
3. Roll my own file-path optimized radix tree-like index, because for
file paths "/" nodes are special.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#22?email_source=notifications&email_token=AAA5QU7EQ27MSIWNNZT2Z43PW3MMHA5CNFSM4F2RPA2KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWDA64Q#issuecomment-495325042>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAA5QU72LE5GROST7YKQVWDPW3MMHANCNFSM4F2RPA2A>
.
|
@banks I have too many things to learn! I wanted to know if it would be relatively trivial (It didn't seem trivial to figure out if it would be trivial; perhaps that should have been my first clue. Thanks again for your help and for your work. |
I see that the implementation is recursive, but this isn't exposed in the public API.
For example:
Get(k []byte)
is defined on*Node
, but the only Node that is publicly accessible is the root node. It would be powerful to be able to make queries relative to any Node (e.g.Get("xyz")
on the "abc" Node finding "abcxyz").I might be able to figure it out reverse engineering the code, but I'm hoping the maintainers know whether this is something that is (a) already supported; just need to provide public access to non-root nodes, (b) not supported but could be with some reasonalble changes or (c) technically not possible given the current implementation.
This is a great library. I'm trying to do some great open source code with it.
The text was updated successfully, but these errors were encountered: