-
-
Notifications
You must be signed in to change notification settings - Fork 731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Associations & Joins #319
Associations & Joins #319
Conversation
This looks amazing! 🤩 Is there any chance this API won’t make it in GRDB 3? What is the big-picture ETA for GRDB 3? The best way for me to start playing with this now is to integrate it into the application we’re working on. But I can only do so if I won’t have to completely rip the join code out later on and if GRDB 3 is released production-read-ish before we release. |
Thanks David! Yes, it will ship in GRDB3. I'm now confident enough that the design is not a dead-end. As the benevolent dictator of this lib, and after two failed attempts at addressing joins, I now that this is the way to go. I have four concerns on the proposed API:
Weeks. There are unchecked items in TODO.md in the GRDB3 section. Not all of them will ship, because I may change my mind. But none of them is as hard as associations. Yet they all take a little of my free time, and there is also a lot of documentation to update, including the Medium articles. And let's not forget the precious RxGRDB.
I understand. It's difficult to commit to beta software. A welcome contribution would be a witty QA mind. |
Great!
That sounds promising. As we are not planning on releasing before June, I feel confident going forward with integrating those new APIs and beta-testing them at the same time. Worst case scenario, GRDB 3 won't be completely out yet, but I'm confident enough with the level of quality of GRDB to go ahead with that. I'll try to document my comments here as I use those new APIs. |
@freak4pc, since this PR may also matches some of your interests, it may a be a good idea to have your opinion, too? |
Hey @groue , Awesome work! I started reading but had to stop to get some work done, here is some of the feedback / questions I had time to go through:
How would you feel abbout
given the Also, Edit: Noticed its a static method on the record, seems fine to me. I'm trying to think if there's a "prettier" syntax but I think this actually looks great. For this API: What is the expected output here? I find the name a bit non-descriptive. Would it get a book including its optional author if exists? Would it get a Book that itself includes some author? Not sure If I missed this but is there a way to fetch an object along with its relation? I saw the "BookInfo" structure that seems to try and solve that, but introducing a third data type isn't necessarily what I'd imagine doing (if it's at all possible). Docs-wise I would bold terms specific to the GRDB codebase such as "The BelongsTo association between a book...", personal taste :) I had a problem with the fact tables must be singular, until I got to the last row there and noticed you can actually define a custom name. I'm not sure I'm fully clear on the documentation there, it notes an example regarding "plural" but the written code is singular. What am I missing? The syntax for manually defining a foreign key is a bit verbose/strange IMO :
Why use an enum if using static lets? Could I use a struct if I'd want to? Why does that "ForeignKey" initializer accept an array? |
I find this solution fairly elegant. I guess we could have a solution where a tuple is returned. But I actually quite like being able to define my own structured types with the types I expected to get back.
I've also had a personal convention of plural table names, but I'm okay changing my convention if it means easier support for joins. Concerning the example in the documentation, I think its just a typo.
I'm fairly sure you're free to use what you want: the APIs just expect a |
@hartbit For the first piece, I would still imagine, if we finally have a Join, that it would actually be a Join. Right now it seems you will still need to sort of do two separate fetches (one for the parent and one for the association), unless I'm missing something. |
The APIs are not doing two separate fetches. Where did you get that impression from? The documentation shows the raw sql generated by each API and they all use joins: |
I'll try to clarify my point :) This example: Would fetch all books of a specific author. The question is what happens if I want to fetch all authors and their respective books - that is the join piece exactly. Based on the examples I saw in the docs I couldn't find a straightforward example of this. |
That example is just an example of a query that uses the foreign key information to get all books of a specific author, which would have required struct BookInfo: FetchableRecord, Codable {
let book: Book
let author: Author?
}
let request = Book.including(optional: Book.author)
let bookInfos = BookInfo.fetchAll(db, request) For more details, see the section titled Fetching Values from Associations: |
Yes but this is exactly the structure I had a problem with - |
For me, this fits my mental model of SQL very well: there is an intermediary object because it represents every row of the request result. Its a solution that evolves well whatever the kind of JOIN result you have, even to multiple joins and many-to-many relationships. A solution which would store books in a |
So I guess we are in a disagree about this - even if it fits your mental model of this, this isn't how SQL Joins look like usually. A SQL Join would usually provide an object with its foreign-keyed objects, without any intermediary object. Using an intermediary object is very much like a having SQL View (e.g. a virtual table). So it highly depends on what kind of solution you're trying to architect here, but this isn't exactly a join per-se (structure-wise, even though the underlying is a query) |
Imagine a more real-world database where books can be authored by multiple authors (many-to-many relationship). In that case, an SQL JOIN like this: SELECT authors.*, books.*
FROM authors
INNER JOIN authorBooks ON authorBooks.authorId = authors.id
INNER JOIN books ON books.id = authorBooks.bookId couldn't be represented without an intermediate object. For an "expert" API like GRDB joins, I think we need to embrace that joins are complicated beasts and provide an API that allows maximum flexibility. |
The question is- are we trying to solve the 10% or the 90% ? Would you agree a double inner-join isn't that common of a use case? What I'm asking about is: SELECT authors.*, books.*
FROM authors
INNER JOIN books ON books.authorId = authors.id E.g. an author and his book |
I think its very personal, but my joins are never that simple. |
And that's absolutely fine - but I can honestly tell you the majority of people would want an object, and its directly-relationed objects with it, and not a more complex join necessarily, so I think its definitely possible to model a solution that would allow for the 90% to have a more concise syntax, while the more complex issues can be solved with an intermediary object. But using an intermediary object for this basic scenario is quite the bummer TBH. |
Hello Shai, thanks for your feedback (thanks David, too)! Let's answer your questions!!
You have seen that Author.books is a static propery of Author (it contains an association). There is no I have been wondering if requesting all books from an author had to be done from an author (give me all books given this association), or from the association (give me all books given this author). Since other ORMs usually load associated records by the mean of methods on the source record, it's been my choice too: // Editor's pick: Fetch books from author
let books = author.fetchAll(db, Author.books)
// Discarded: Fetch books from the association:
let books = Author.books.fetchAll(db, from: author) Now fetching methods are not enough. We also need requests, which can feed database observation tools. That's why we also have: let request = author.request(Author.books)
request.rx.fetchAll(in: dbQueue).subscribe { books in
print("new books: \(books)")
} So: And the documentation on this topic has to be improved, because it mixes requests (advanced feature) and fetching methods (that everybody needs) on the same level.
Well spotted, and thanks. Yes, "hasOne", "hasMany", "belongsTo" have the nice historical log of abstractions that are well understood by developers.
It is described here: "This request fetches all books, including their author's information. Books that don't have any author (a null authorId column) are fetched as well." I'm not sure I understand your question, which may mean that I totally failed at explaining the four joining methods, and that you built a mental model of them that is at odds with their actual behavior. No offense, this happens all the time when APIs are not clear enough. I don't know if would still ask this same question now. If you do, would you be able to ask it again, but with different words so that I can figure out what's in your mind, and give a proper answer?
Excellent question. This deserves a new entry in the Future Directions paragraph. No. Today, there is no API to load all authors will all their books, and you'd have to do it manually. I had the gut feeling that it was not a feature required for an initial release.
1000 points question again. Not only are those third data types the very reason why we can ship this joining API. But they also are a very recommendable way to consume database values. The previous attempt at providing an API that generates joins would not use those "third data types". It would return tuples instead. It did address, for example, the "all authors will all their books" request by returning But without third data types, we have big issues:
And "Third-party types" have another benefit (because I have a hidden agenda): by fostering the decoding of joined requests into third-party types, I grant GRDB users with data consistency for free. And the topic of lazy loading of associated records vanishes. What's better than this "BookInfo" type that contains exactly what I need? Why feed it manually with several db requests performed at various times in my app when I can get it for free? As David said:
That's exactly my expectation. You later asked:
Unless I'm mistaken, such a requests needs two SQL queries, not a single one (please correct me if I'm wrong): SELECT * FROM authors WHERE ...
SELECT * FROM books WHERE authorId IN (...) Such requests can currently not be expressed with the I was expecting that future additions would add another protocol on top of This makes me think that I may have to rename You again:
I know that you like model objects to have properties to their associated objects. You call that "modelling", and it's a fair habit that is widely supported by many years of software design, ORM APIs, accross many years and thousands of lines of code: class Book {
var title: String
var author: Author
} I want to break this habit 🤘 It's OK when your models are managed, uniqued, and lazy-loaded. GRDB records are much more low-level: a fetched record is an in-memory cache of database values. It can be a plain Swift struct. That's why we need cache invalidation, and why I put so much energy on providing easy-to-use database observation techniques :-) This means that the above Book class with its author property, when it exists, belongs to the application. The application is sole responsible of its management (which is not easy). By ditching uniquing, lazy loading, and not managing records, GRDB was able to simplify multi-threaded code. The cost is that records have I wholeheartedly understand your point. And I jump with passion in the opposite direction, with many many types focused on precise tasks. My personal experience (which guides me a lot) shows that those types are often private types inside view controllers (or view models or whatever). Add I like to focus on a local and isolated type, instead of bloating a unique model type with convenience methods used by a single screen of an app.
Very good idea. There are so many new concepts in this doc!
As David has noticed, it's surely a typo. I have tried to be consistent (with the only exception of the "demographics" table which is for me the needle that reminds me that something is not "extra clear" here). I deeply regret this "singular table" convention. I really want to remove it. There it lies: let author = Book.belongsTo(Author.self)
let request = Book.including(required: author)
let row = try Row.fetchOne(db, request)!
print(row.debugDescription)
// ▿ [id:1, authorId:2, title:"Moby-Dick"]
// unadapted: [id:1, authorId:2, title:"Moby-Dick", id:2, name:"Herman Melville"]
// - author: [id:2, name:"Herman Melville"] You see the This key comes from the association That association has not much anything to chew on: the Book type (irrelevant), and the Author type, which comes with its I've tried to build "author" from the Author type itself, by playing with To remove the "singular table" convention, we just need to be able to generate "author" from the Author type.
Right. This "namespace" enum has no purpose, and pollutes the doc.
As David said above, because foreign keys can span several columns. About the "foreign key" name. It's hard-core, very low-level. I kept it because the two types at both ends of an association use the same foreign keys so disambiguiate their own associations: struct Book: TableRecord {
// Define foreign keys
static let authorForeignKey = ForeignKey(["authorId"]))
static let translatorForeignKey = ForeignKey(["translatorId"]))
// Use foreign keys to define associations:
static let author = belongsTo(Person.self, using: authorForeignKey)
static let translator = belongsTo(Person.self, using: translatorForeignKey)
}
struct Person: TableRecord {
static let writtenBooks = hasMany(Book.self, using: Book.authorForeignKey)
static let translatedBooks = hasMany(Book.self, using: Book.translatorForeignKey)
} I agree that this is clearly not the most elegant part of the API. Thanks for your feedback so far! I already a nice list of improvements to do, and I hope I could answer your questions and concerns, even if it may look like we disagree sometimes. |
I've given my point of view above, but yours is still interesting. Do would have an idea of what it would look like (the definition of two book/author types, and the ideal API that loads what you need)? |
Given your explanation on top I don't think you'd really want to do this :) I would imagine for a behavior that causes the "secondary" object to be fetched along with the first without an intermediary object, but you've fully explained your stance on it and I'm fully OK with it. I don't think I have any serious beef with it at this point, it's just a API design choice. |
@freak4pc I'm happy we understand each other. Now if I write bold words and use the 🤘 emoji, this does not mean that I don't want to listen (although heavy rock music can damage ears). Maybe I can smooth your concerns by telling that supporting classic modeling is just an class Book: FetchableRecord {
static let author = belongsTo(Author.self)
var title: String
lazy var author: Author { /* fetch the author if not loaded yet */ }
private var authorId: Int64 // support for the author lazy property
init(row: Row) {
self.title = row["title"]
self.authorId = row["authorId"]
if let authorRow = row.scoped(on: "author") {
self.author = Author(row: authorRow)
}
}
}
let books = try Book.including(required: Book.author).fetchAll(db)
for book in books {
print("\(book.title) by \(book.author.name)")
} I just can't really foster such pattern in the documentation, because lazy loading of associated records breaks the data consistency guarantee (and I morally can't give dangerous advice). It's a little too smart, you see? |
Yeah I'm aware of the lazy method but I can get this done without this PR, so I didn't even mention it :) No worries, I totally what your architectural standpoint is and it's 100% good! 💯 |
For information, some tests fail because the various tested SQLite versions don't all agree on the As far as I can see, tests of database regions have to be adapted, but they don't reveal any observer breakage (although I'll have to add a few integration tests that lift remaining doubts). |
I did start using @freak4pc's feedback:
After a good second read, I find your idea stellar. The Requesting Associated Records chapter has been rewritten accordingly.
The Joining Methods chapter has been fully rewritten, and I hope it's much more clear now. |
Good idea, done.
Yes. I have removed the surprising enum in the Foreign Keys chapter. |
Wooo! 🎉 Love this new API ! Looking forward :) Also documentation updates look awesome. |
2c1705d
to
39c5313
Compare
…on.RightAssociated
Rename AssociationMapping to JoinCondition Replace AssociationMappingRequest enum with ForeignKeyJoinConditionRequest struct Rename AssociationChainOperator to AssociationJoinOperator, and "chain" to "join"
…to avoid AnyFetchRequest contamination
…e SQLTableQualifier into TableAlias.
This pull request enhances the query interface so that it can build and consume some joined requests. There is much room for future enhancements, but we have a shippable API already.
Documentation: GRDB Associations
@hartbit, @smhk, @Sroka, @cfilipov, @hdlj, @mtissington, @pakko972, @palpatim, @pierlo, @schveiguy, and @swiftlyfalling, you have shown your interest in GRDB in general and sometimes on join requests in particular: I'd be glad if you would have a look at the proposed documentation. I'll be happy answering your questions, addressing your concerns, and generally improve this PR with your help.