-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ORM query service #11774
Comments
I know that my opinion may not be the most popular one, but honestly, if we knock out an out of the box integration with a DB with tonnes of integrations from the get go, like: Postgresql (graphql) or Mongodb (mongodb cli), then instead of reinventing and maintaining solutions we embrace existing leading solutions with web2 world with tonnes of integration and tooling. @aaronc mention that we still need historical queries (tbh, I don't know any app except some bridges which may require it). So maybe only querying by a primary key is enough? |
I totally agree we should make it really easy to index state in conventional databases like Postgres and Mongo. However, I also think the SDK should provide decent out of the box querying which is the current status quo. The reasons in my mind for this are:
|
My thinking right now is to move forward with approach A & D together, starting with A. Not sure about the logical query layer. On one hand it is sort of better and it's not that hard to implement it naively. But also it obscures performance problems if there's no plan to improve it. So direct index access may be the simple way to get the most bang for the buck and further efforts can be placed on indexing in other dbs as @robert-zaremba suggests. |
#11791 proposes .proto definitions for a generic gRPC query service (approach A) that uses direct index access rather than logical queries |
I discussed this a bit with @robert-zaremba today and I think I agree with his point that there's a risk of postponing the inevitable need of a lot of projects to break queries out into a secondary layer as the project grows. I still think we should provide out-of-the-box queries as that will be desirable at the experimental stage. But I think it may be worth the effort to define a stable API with logical queries that can be retargetted against other DBs as a project grows. In particular, there's a risk with direct index access of encouraging clients to build too much around available on-chain indexes. And I think we actually should allow a module remove an index if it is only used by clients but not the state machine, with the understanding that this can easily be offloaded to other DBs. So I'm thinking of closing #11791 and proposing a logical query layer that is sufficent but also not too complex probably in an ADR. |
## Description Ref #11774 This proposes a generic gRPC query service that uses direct index access and can basically be implemented as a thin layer directly on top of `ormtable.Table`. There is relatively little code required to implement this but it can potentially provide a lot of value as a way to expose queries to basically everything that's using the ORM without needing to implement any custom gRPC queries. A version of this which is type safe is also proposed in #11774 but that would require additional codegen of .proto files. #11774 also proposes a logical query layer instead of direct index access but that's substantially more code to do even naively and an optimized version would take more effort. --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting)) - [ ] provided a link to the relevant issue or specification - [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules) - [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [ ] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable)
## Description This starts the implementation of approach (C) in #11774 by generating `_query.proto` files for `.proto` files which have ORM definitions. It does this using a new protoc plugin called `protoc-gen-go-cosmos-orm-proto`. Please review `bank_query.proto` and `test_schema_query.proto` as these are the outputs of the codegen. The approach taken does not attempt to do any sort of logical queries as discussed in #11774 and instead provides direct index access in the most simple and complete way that we can easily implement now. More advanced things in #11774 could be implemented later, but this should be possible to deliver relatively quickly and should allow developers to completely skip writing gRPC queries manually if they use ORM. Note that the implementation of these queries can provide merkle proofs because the mapping to state is well known. I did need to disable pulsar codegen in this PR because it doesn't support proto3 optionals which are needed here (depends on cosmos/cosmos-proto#12). Fortunately, pulsar is 100% compatible with the official google codegen and there is no problem running ORM tests against the official codegen. --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting)) - [ ] provided a link to the relevant issue or specification - [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules) - [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [ ] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - [ ] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable)
offchain indexer is a replacement for this |
User Story
As a module developer, I would like the ORM to auto-generate a query service for clients so that I don't have to manually implement gRPC queries.
Possible Designs
None of these are mutually exclusive, it's a matter of prioritizing where we focus energy.
A) Generic gRPC service
This would involve creating a single gRPC endpoint that allows querying all the ORM tables in an app and returns responses using protobuf
Any
s. One possible approach is to support a set of operators similar to what is supported in mongodb.Ex:
B) Generic JSON REST service
Same as the above but with JSON instead of protobuf to allow for a simpler more flexible query language along the lines of what is provided in mongo.
C) Generate gRPC .proto files for all tables
We could generate a query service for every ORM .proto file and dynamically serve it. For instance, for the bank example in the ORM tests, we would generate a query service somewhat like this:
This would require another layer of codegen and is more similar to what we currently have with gRPC.
D) Graphql
While graphql is a departure from gRPC, it is likely better for building client apps and would allow for getting a bunch of objects in a single query and also traversing some joins (which could be done with additional protobuf annotations). A proposed design to follow is https://www.graphile.org/postgraphile/introduction/ which automatically generates Graphql services just from SQL table definitions.
Design Considerations
Some issues to address when designing this are:
Storage Types vs Presentation Types
See regen-network/regen-ledger#930. In particular, in lots of state tables we store addresses as bytes but users are used to seeing these usually as bech32 strings. All of these fields should be annotated with
cosmos.AddressBytes
so a smart query layer could do the conversion.Logical Queries vs Direct Index Access
ORM queries in the state machine are currently done with direct index access and don't really have a separation between a logical and physical plan as more robust databases do. This is probably good for state machines generally because it is more predictable in terms of performance and gas consumption. gRPC queries are also generally written as direct index access just because that's easier to do.
An auto-generated client query layer could expose indexes directly as we currently do or include a more generic set of logical query filters like MongoDB or SQL. The latter would require some query planner even if it is fairly naive. The advantage of full logical queries are:
For these reasons, I'm leaning towards supporting logical queries with the most naive query planner that is non-trivial (meaning that indexes are selected based on some very simple rule-based heuristics and in memory filtering and sorting are the fallback for all non-trivial cases).
The text was updated successfully, but these errors were encountered: