-
-
Notifications
You must be signed in to change notification settings - Fork 727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Joins in the Query Interface #176
Comments
Ha, joins 🦄! That's a uneasy topic. May I suggest a reading list?
Bring in your imagination, @hartbit, because it's more a design topic than a technical one! |
I've added all those articles to my reading list. Will come back here once I've had time to digest them all. |
What do you mean by the "syntax-based approach" of SQLite.swift? Can you explain in more detail what you don't like about it? I'm not a huge fan of it either, but it has the advantage of name-checking column names. |
I mean that SQLite.swift sticks to the SQL syntax, in its attempt at being able to generate as many SQL requests it can. It is a 1 to 1 mapping from Swift to SQL. For example, it forces users to manage table and column aliases in order to avoid ambiguity, just like SQL. It also forces the user to bend his mind around a Swift API that brings nothing to raw SQL. Why invent a new, poorer language, when SQL is already there? For me, it is is nothing but an academic exercice, with no practical value except Swift training. And it is a difficult new language: non trivial SQL queries are difficult to express with SQLite.swift: users have to sweat their Swift code, assuming they manage to write it. Yeah, yeah, name-checking, type safety. Well, when name-checking and type safety prevent me from writing the SQL I need, they'd better back away. You may have noticed that SQL is a first-class citizen here: that's because when the GRDB query builder misses features, it's easy to plug some raw SQL and call it a day. You don't lose anything: fetched rows are still rows, fetched records are still records, etc. A general-purpose join builder that mimicks SQL is not a goal, because it would be plain louzy. Even the status quo (no support for joins) is less louzy. |
Yes, a good design should make it easy impossible to generate ambiguities.
I totally understand, and thats why I love GRDB so much. But being able to express a query with the Query Interface allows a few niceties. For example, I've found it very useful to have I'll continue reading the other material :) |
The failed PR #103 (doc) did achieve this goal. But it was not welcomed by the early adopters who were nice enough to evaluate it (I'm so grateful!). Since then, I really think Diesel has nailed something important, but that's still fuzzy in my mind.
I'm glad you do! Don't miss row adapters because I do think that they can help consuming joined queries (generating them is useless if one can't extract values from the results, isn't it?). |
Yes, I've been thinking about that a lot. But even without them, it's still interesting to be able to use joins to filter rows. In those cases, you wouldn't need a hypothetical "join-row-adapter". But I agree they should come in the picture. |
We could imagine a general extension Request {
// Not tested
func fetchCount(_ db: Database) throws -> Int {
let (statement, _) = try prepare(db)
let sql = statement.sql
return try Int.fetchOne(db, "SELECT COUNT(*) FROM (\(sql))", arguments: statement.arguments)!
}
} Not the most efficient SQL ever, but it would work: func people(in country: Country) -> AnyTypedRequest<Person> { ... }
let request = people(in: france)
let frenchies = try request.fetchAll(db)
let frenchCount = try request.fetchCount(db) |
That sounds like a good thing to do, even if we end up having a join query interface :) |
@groue What did you like in your original approach and would like to keep? What did you not like and would avoid? What did you enjoy in the Diesel approach? Just trying to understand to come up with a solution you have already thought of and already dismissed. |
Some desired features of a future join API:
The list of desired use cases based on associations has still to be established. I need to tell what I like about Diesel first. |
After reading everything, it seems that the only feature that your old solution was missing compared to the points you mentioned above is relationships with cardinality ( While the idea of using row adapters for handling disambiguation is interesting, I keep coming back to the idea that if columns and relations could be configured with the table they correspond to, we could have a simpler API:
If ambiguities come up because the same table is joined multiple times, perhaps we could use aliases. I know it's not very clean as is, but I don't easily think of joins in a hierarchical manner so this structure comes more naturally to me
This is just some initial thoughts. |
PS: The thing I like about attaching the column and relation info with the table is that once Swift gets more powerful features like property behaviours and macros, we should hopefully get those generated for free with a little bit of decoration. |
The API you propose is inspired by #103, isn't it? This PR did not find its public. Do you think that the removal of automatic management of aliases, the extra complexity of the Column type, and the replacement of strings with types, address those concerns (maybe they do)? |
Here's my take on the criticisms of #103: @mtissington says:
This sounds to me like a criticism of the feature itself, not its design or implementation. But I think we agree that a JOIN layer is useful for some people (I would strongly benefit from it). Having to resort to SQL is sometimes necessary, but we loose the many benefits of the query interface: compile-time checking of column and relation names, flexibility to adapt a query to change what is fetched, etc... In @palpatim's answer, he starts with the following paragraph:
This seems to me like he prefers true ORM style libraries which do more for him. But GRDB has clearly distanced itself from them to avoid the famous 1+N problem. In that regard, your implementation was following GRDB philosophy. Nonetheless, he says:
And I think I agree with that. Having to think in terms of hierarchical scopes is fairly different to how the SQL syntax represents them. My mental model has adapted to the way they are expressed in SQL and the hierarchical nature of the scopes got me a little confused. That's why I reverted to the alias solution in my code above. @cfilipov echoes my thoughts:
One of his negative comments about your proposal was:
This caught me off-guard initially also. But once I understand that it existed to quickly He later comments:
I think his remark stems from the same scoping and hierarchical issues I detailed above. But only he can say :) As you see, I have the impression that most criticisms in that post were either about the philosophy behind GRDB (one user asking for less query interfacing, the other for more automation), and the few real criticisms seem the centre around the complexities of the scoping mechanism. So I'd tend to say that the direction you were going in was okay, but perhaps just needs some rework in terms of handling table ambiguities. I hope that the example code I posted above is a potential direction. |
Hello @hartbit
That's really great, because there's a lot of information, desires, and fears there. I won't always agree with you, but I'll try of explain my points of view as clearly as I can.
@mtissington indeed does not need a join DSL. Does it mean that we can ignore his opinion? I'm not sure: I second him when I doubt that a DSL that mimics SQL (importing all its idiosyncrasies such as manual aliases, for example) has any value that goes beyond the joy of completing a funny intership project. You cite "compile-time checking of column and relation names". But this is only a facet of the query interface, and a much overblown one as a matter of fact. Nothing prevents you from using wrong columns with the wrong table, or misnamed columns (easily unnoticed), or full-text operators with non-full-text tables, etc. The world of query-interface mistakes is much wider than its compilation errors. You can avoid some of them with a strong discipline: long lists of static Column constants such as You also quote the "flexibility to adapt a query to change what is fetched". The query interface is indeed based on method chaining: A bigger value of the query interface lies in its ability to talk to developers who aren't familiar with SQL. I added the query interface for one of my co-worker who didn't feel comfortable with SQL. When the trivial #103 did not score well on that scale, as a matter of fact. And the query interface has support for advanced SQL features such as Now I think that spending a lot of energy on a SQL-mimicking DSL (especially for joins) is bound to deceive both advanced developers (who will spend time converting natural SQL into Swift, and face some inevitable blind spots), and beginners (who will remain helpless and miserable due to their inability to build SQL joined queries in their mind). There is a real danger of wasting time here - for you, me, and GRDB users of various skills. I'm sure we can avoid this trap. Let's see what @palpatim has to say:
What he means is that ActiveRecord lets you write: ActiveRecord is also able of Back to @palpatim, who has a valid request: "my recollection is that [ActiveRecord] had a declarative interface that allowed you to map relationships in the model and then not have to think about them too hard at the call site." Those declarations are the relationships I talked about above, and that are well described in ActiveRecord documentation. And those relationships are concepts that fit well the beginner developers I'd like that we'd care about.
I don't follow @palpatim here: the code he criticizes is almost identical to the Ruby
Good ORM libraries avoid the 1+N problem, too. What GRDB doesn't inherit from traditional ORM is lazy loading, record auto-updating, record uniquing (because immutability), and automatic relationships methods (because it can't). #103 also had a strong SQL smell that did repell @palpatim - and I have explained above why I also doubt, now, that sticking to SQL is a valuable option. Of course relationships are fundamentally based on the database schema, and any documentation has to illustrate relationships with database tables (example). But talking about the database schema does not imply forcing SQL down the reader's throat.
Yes. Scopes were the main #103 novelty, and failure. They were used as paths to navigate inside the linear selection of an SQL query (accessing
Isn't this a variation on @mtissington's point? Anyway, you're both right. But I wonder how he maps
Yes.
I agree with your premises. We may need more than a rework.
You must have guessed now that I fear encouraging you to continue in that direction. It's time to turn on some lights, and talk about Diesel. DieselTheir documentation starts with:
You bet! The landscape of database libraries in the context of new languages like Swift and Rust, that both foster immutability and protocol(traits)-oriented programming, is bleeding edge. Realm is so 2010! Being experimental is our moral duty 😉 What's the Diesel proposal? They start with two records: #[derive(Identifiable, Queryable, Associations)]
#[has_many(posts)]
pub struct User {
id: i32,
name: String,
}
#[derive(Identifiable, Queryable, Associations)]
#[belongs_to(User)]
pub struct Post {
id: i32,
user_id: i32,
title: String,
} Users and posts are bound with a has many and a belongs to relationships. These are the only relationships supported by Diesel. Note how Diesel records are similar to GRDB records: columns are supported by properties, and they can be implemented as value types (structs) if the library user wants to. Diesel and GRDB share the same fundamentals about what a database model is. Diesel can load a user's posts: let user = try!(users::find(1).first(&connection));
let posts = Post::belonging_to(&user).load(&connection); So far, so good. Diesel can also load all the (User, Post) pairs: let data: Vec<(User, Post)> = users::table.inner_join(posts::table).load(&connection); Meh. The value of this is limited: the Fortunately, they provide some sample code that builds this fn first_twenty_users_and_their_posts(conn: &PgConnection) -> QueryResult<Vec<(User, Vec<Post>)>> {
let users = try!(users::limit(20).load::<User>(conn));
let posts = try!(Post::belonging_to(&users).load::<Post>(conn));
let grouped_posts = posts.grouped_by(&users);
users.into_iter().zip(grouped_posts).collect()
} And there ends their doc/demo. That doesn't look very sexy, hu? But we don't know what terrible problems they had to deal with. Let's forget about the form of their API, and focus on its susbtance. There lies the value: Diesel loads object graphs into standard collections and tuples of records. Especially:
These ideas are highly inspiring, because they completely avoid whole classes of problems, questions, and extrapolations (feature wishes). No lazy loading. No visible navigation inside the linear database rows. No record methods like Caveat: the number of available join queries has to be limited. Swift or Rust don't compile methods that generate recursive arrays of tuples of arrays of arrays of tuples of arbitrary depth. Static typing is a hard limit. But within those limits, there are plenty of opportunities to provide useful methods that load useful object graphs in efficient ways. Yes, support for object graphs is not the same as support for joins. But there is an overlap. I'd like to see what could be achieved this way! |
A side note about request derivation (the feature that I claim to be too much valued, considering how rarely reusable code is). Sometimes code can be reused. GRDB provides some help for derivation of your custom SQL requests, thanks to the
// func fetchOne(_ db: Database, _ sql: String, arguments: StatementArguments? = nil)
User.fetchOne(db, "SELECT * FROM users WHERE id = ?", arguments: [1])
User.fetchOne(db, "SELECT * FROM users WHERE id = :id", arguments: ["id": 1]) Positional ( This allows custom requests to build several queries that reuse a common SQL template, and merge various arguments together: struct MyRecord: RowConvertible {
static func request1(...) -> AnyTypedRequest<MyRecord> {
return makeRequest("...", arguments: [...])
}
static func request2(...) -> AnyTypedRequest<MyRecord> {
return makeRequest("...", arguments: [...])
}
static func request3(...) -> AnyTypedRequest<MyRecord> {
return makeRequest("...", arguments: [...])
}
private static func makeRequest(_ sqlSuffix: String, arguments suffixArguments: StatementArguments? = nil) -> AnyTypedRequest<MyRecord> {
// Base sql and arguments
var sql = "SELECT ... FROM ... "
var arguments: StatementArguments = [...]
// Append suffix sql and eventual suffix arguments
sql += sqlSuffix
if let suffixArguments = suffixArguments {
arguments += suffixArguments
}
// Build final request:
return SQLRequest(sql, arguments: arguments).bound(to: MyRecord.self)
}
}
try MyRecord.request1(...).fetch...
try MyRecord.request2(...).fetch...
try MyRecord.request3(...).fetch... |
I think we agree more than you think :)
I think its a question of degree. Mimicking SQL down to the syntax is not necessarily useful. As you pointed out at the beginning of the discussion, "This approach [...] requires SQL knowledge (one has to build the SQL join in one's head before "translating" it into Swift), and yet is less readable than SQL.". But when the layer adds useful features on top of a layer of convenience, like the current Query Interface does, it brings value to both experts and newcomers. I think its a question of finding the right balance.
I agree with you that the world of mistakes is much wider. But I find compile time checking of names very useful when I work, and for more than just avoiding typos. I can make a typo in my stringl-y typed SQL or use the wrong column in my query interface and in both cases I will notice it pretty quickly and fix it. But where it really shines is when I'm reworking my code. Imagine that I rename or remove a field in a
I know that this is not ideal. But it seems like something that can and should be improved when Swift gets more metaprograming/reflection features. For example, if Column constants could be automatically generated (I already do this with Sourcery by the way), we could imagine an API like this:
where the argument passed is of type
Reusable code between projects is rare indeed, and thats where I apply the Rule of Three. But inside a project, I often have the need to reuse some queries. For example:
It's definitely useful for those users. I just have a strong feeling that it also very valuable to developers more familiar with SQL. But perhaps I am simply projecting my own uses of the library.
Yes, that's what I undestood from what he said. And I also agree and understand why GRDB can't follow this.
Side discussion concerning immutability. Is it's a strong goal of GRDB, why does
Don't worry. It was just an idea to spur discussion :)
What do you mean by that?
I totally agree with this direction. Getting a
Me too! Btw, haven't had time to read your last comment. Will loot at it this evening. |
I'm glad we do! We have both exposed some options in the most radical way, and now we can focus on "finding the right balance", as you say. Can we agree on a new wording for our goal? Codename "Graph API": an API that loads object graphs built from relationships between record types. Documentation will introduce relationships between record types based on patterns of database tables, and expose how GRDB lets the user fetch a set of graphs of objects. The SQL point of view on the Graph API will be introduced as an extra, for advanced users. Compilation ChecksI've tried to show that this topic is over-pushed, and you have replied with valid arguments considering your coding practices. Those two points of view reflect two ways to use GRDB: one that writes as little code as possible (with caveats), and one that goes as far as using Sourcery in order to generate a heavy load of boilerplate compiler constants that help trigger some compiler errors in case of refactoring. Until now, GRDB has been able to welcome both practices. And there is no Unlike #103, the Graph API will be heavily Record-based. New protocols will possibly be introduced, with new constraints on adopting types. Yet you know that I'll look at boilerplate with a critic eye. I don't think that the Graph API will need much declarations beyond individual relationships: struct Book {
static let author = BelongsTo(Author.self, ...)
}
struct Author {
static let books = HasMany(Book.self, ...)
}
// [(Author, [Books])]
let graph = try Author.with(Author.books).fetchAll(db) And I'll leave Request RefinementYou're right, request refinement is useful. I was trying to say that it is not its major feature, and that we can live with Graph requests that can not be refined. For example, how do you refine the filtering of books in We can keep request refinement in a corner of our head, but it must not jeopardize the legibility of the Graph API. Beginner Developers
I have started gathering all the JOIN requests that exist in the Immutability
GRDB never mutates your records on your back. Record is a class because it needs private storage for changes tracking. And it behaves like a value type if all properties are I think I didn't understand your question. Diesel
I mean that the extraction of
That sentence of yours sounds like honey to me. I'm sure we'll build something valuable! |
And it's great that GRDB allows you to work both ways, even in same project, with very little friction.
Sounds like a great idea :)
I think that as long as there is an easy way to filter joins (as in your example above), I think we can live without the refinement feature.
I'll look into mine too. Once an API starts emerging, I'll look through them to see which can and can't be implemented with the Graph API.
Yes, of course :)
I didn't express my point well enough. What I meant is that, because I use Record, and because its a class, I've already experienced bugs where I had manipulated a Record in parts of my application and another part of the application holding to the same reference got very confused by that :) It would not have happened with structs. My point is that Record is so prominently advertised in the documentation that it almost feels like the natural type to use. I think it might be worth spending more time advertising the use of value types as you did in your blog post.
Where do we go from here? Some more questions?
|
Yes @hartbit. That'll help us evaluate the range of use cases that we need/wish to support.
I see what you mean. Until now, Record has served me well, but I have faced troubles as well, and my GRDB best practices are still in flux ;-) My current bottom line is: the database is the only source of truth. Any fetched record may quicky drift away from the truth since it is essentially a cached value. For example, as a screen is displaying some user's info, a background web request may update the user in the database: the displayed user record may be outdated. As stated in Four different ways to handle SQLite concurrency, such conflicts must be handled by the application (and a framework like Core Data makes conflict handling implicit and hard to deal with - while GRDB prefers to make it explicit, hoping that particular conflicts are easier to deal with than general ones). That's why a read-only screen may use a FetchedRecordsController, and get automatic updates. A read/write screen may use the All those ideas are difficult to document and foster (assuming they are actual best practices)! And I'm not familiar enough with other application architectures like reactive programming and MVVM to know how they could reframe the question. Meanwhile Record is the simpler type. But you're right, the Record introduction could have a paragraph that targets advanced users and talks about the extra advantages of structs + protocols (without introducing FUD and making
I'd like to reframe those questions: since the Graph API will be documented as a way to fetch record graphs from the database, not as a way to perform joins, we'll implement the SQL needed to implement the graph use cases. Not the other way around. Our knowledge of SQL will be used, but it should not drive the use cases. You're asking what's next, here are some ideas:
That's at least what I intend to do. Does it look sensible to you? |
CardinalitiesIt's worth detailing relationships and the cardinalities guaranteed by the database, that is to say how many objects are involved in both sides of each relationship. Formal cardinalities (in data modelling) and concrete ones (as implemented by a relational database) do not exactly overlap. For example, "at least one" is not easily expressed in SQL. I'll focus on concrete ones because they map well with the Swift type system. There are only three concrete relational cardinalities:
|
RelationshipsNow that cardinalities have been detailed, we have a better idea of the Swift relationships we can express:
We have two variants of belongs to, based on the eventual nullability of the foreign key: an optional one and a non-optional one. Should both variants be provided in Swift, or should we only provide the optional one? This leads to a broader question which is the tolerance for loose schemas. A loose database database schema does not use database constraints to enforce the intent of the programmer. I personally like strict schemas, and GRDB can be rigid sometimes. For example, look at that quote from the fatal errors documentation:
But the Graph API has to be much more tolerant, because beginner developers are an important target. And a beginner developer, by definition, is not expert in relational constraints. We'll have to evaluate the consequences of loose schemas on our APIs: some bugs can be tolerated (such as the consequences of a missing unique index in a has one relationship), and some bugs should be carefully evaluated (data loss opportunities, and code that trumps the reader (such as the Distinguishing between belongs to (optional) and belongs to (mandatory) allows to load optional or non-optional parents, but makes the Swift API more complex, and more exposed to bugs introduced by loose schemas. Supporting only one variant, belongs to (optional), has the advantage of simplifying the API, and supports more loose schemas. Finally, a single belongs to (optional) relation can still be lately used as a mandatory one (when the user declares an optional relationships, and later uses it to load non-optional parents). This makes me think that here is our minimal list of relationships:
Some domains overlap, but differences remain in the declaration of relationships: Belongs to and has one have the same domain, but the foreign key is in the table at the origin of the belongs to relationship. Has many and Has and belongs to many have the same domain, but the latter involves an intermediate table. Maybe we'll reintroduce belongs to (mandatory) later, when we evaluate our Swift API proposals. |
Take your time, @hartbit.
I'll try to keep a single belongs to (optional), and push the choice between optional/non-optional until usage (as
Nothing against it! But I don't know how to efficiently specify the type of the extracted value. The lack of typed expression is at play here: GRDB doesn't know the type of columns and derived expressions, because it promotes late conversion (as we have discussed in #183). We don't know the type of values returned by The rationale against typed expressions (fostering record properties as the source of type safety instead of boilerplate lists of column declarations + robustness against oddities in the dynamic typing of values in SQLite) is subtle. My naive goal is that one does not actually miss typed expressions. But I can't prevent people from thinking they do. The peer pressure for Swift type cathedrals is very high these days. And here we actually do miss them, since graphs are not backed by record properties. We could have a middle ground: // All persons with the year of their most recent book
// [(Person, Int)]
let authoredBookYear = Person.authoredBooks["year"] // or ...[Column("year")]
let graph = Person.annotated(with: max(authoredBookYear).as(Int.self)) I'm not sure. |
Yeah I was thinking of that last example. |
Yes. It closes for good the opportunity for derived records, but it's a choice we could live with. |
Why would it necessarily close the opportunity? It's still a viable option for who wants to do that, no? |
I want to quickly familiarize myself with GRDB, so as to contribute to this fantastic thinking :) |
Everything has a start! https://github.com/groue/GRDB.swift/tree/Graph has preliminary support for "has many" associations. First declare records and their associations: struct Child : TableMapping, RowConvertible {
static let databaseTableName = "children"
let id: Int64
let parentId: Int64
let name: String
init(row: Row) { ... }
}
struct Parent : TableMapping, RowConvertible {
static let databaseTableName = "parents"
// Associations:
static let children = hasMany(Child.self)
static let filteredChildren = children.filter(Column("name") == "foo")
let id: Int64
let name: String
init(row: Row) { ... }
} Setup the database with foreign keys so that GRDB can infer involved columns: let dbQueue: DatabaseQueue = ...
try dbQueue.inDatabase { db in
try db.create(table: "parents") { t in
t.column("id", .integer).primaryKey()
t.column("name", .text)
}
try db.create(table: "children") { t in
t.column("id", .integer).primaryKey()
t.column("parentId", .integer).references("parents") // <-- there
t.column("name", .text)
}
} And now you can use the associations: try dbQueue.inDatabase { db in
// All parents with their children
let graphRequest = try Parent.including(Parent.children)
// Altered children
let orderedChildren = Parent.children.order(Column("name"))
let graphRequest = try Parent.including(orderedChildren)
let graphRequest = try Parent.including(Parent.filteredChildren)
// Altered parents
let graphRequest = try Parent
.filter(Column("name") == "foo") // Alteration before `including`
.including(Parent.children)
let graphRequest = try Parent
.including(Parent.children)
.filter(Column("name") == "foo") // Alteration after `including`
// The graph variable is of type [(Parent, [Child])]
// Two SQL queries are executed:
// SELECT * FROM parents ...
// SELECT * FROM children WHERE parentID IN (...) ....
let graph = try graphRequest.fetchAll(db)
for (parent, children) in try graph {
...
}
} |
Implemented so far: // BelongsTo
struct Book {
static let author = belongsTo(Author.self)
let authorId: Int64?
...
}
struct Author {
...
}
let oldBooks = Book.filter(Column("year") < 2000)
let orderedBooks = Book.order(Column("title")
let twentiethAuthor = Book.author.filter(Column("birthYear") >= 1900)
let graph: [(Book, Author)] = try Book.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try oldBooks.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try orderedBooks.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try Book.joined(with: twentiethAuthor).fetchAll(db)
let graph: [(Book, Author?)] = try Book.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try oldBooks.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try orderedBooks.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try Book.leftJoined(with: twentiethAuthor).fetchAll(db)
let book: Book = ...
let author: Author? = book.fetchOne(db, Book.author)
// HasOne
struct Country {
static let profile = hasOne(CountryProfile.self)
...
}
struct CountryProfile {
let countryId: Int64?
...
}
let europeanCountries = Country.filter(Column("continent") == "Europe")
let orderedCountries = Country.order(Column("name")
let euroCurrencyProfile = Country.profile.filter(Column("currency") == "EUR")
let graph: [(Country, CountryProfile)] = try Country.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try europeanCountries.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try orderedCountries.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try Country.joined(with: euroCurrencyProfile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try Country.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try europeanCountries.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try orderedCountries.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try Country.leftJoined(with: euroCurrencyProfile).fetchAll(db)
let country: Country = ...
let profile: CountryProfile? = country.fetchOne(db, Country.profile)
// HasMany
struct Author {
static let books = hasMany(Book.self)
...
}
struct Book {
let authorId: Int64?
...
}
let twentiethAuthors = Author.filter(Column("birthYear") >= 1900)
let orderedAuthors = Author.order(Column("name")
let oldBooks = Author.books.filter(Column("year") < 2000)
let graph: [(Author, Book)] = try Author.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try twentiethAuthors.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try orderedAuthors.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try Author.joined(with: oldBooks).fetchAll(db)
let graph: [(Author, Book?)] = try Author.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try twentiethAuthors.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try orderedAuthors.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try Author.leftJoined(with: oldBooks).fetchAll(db)
let graph: [(Author, [Book])] = try Author.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try twentiethAuthors.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try orderedAuthors.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try Author.including(oldBooks).fetchAll(db)
let graph: [(Author, Int)] = try Author.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try twentiethAuthors.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try orderedAuthors.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try Author.annotated(with: oldBooks.count).fetchAll(db)
let author: Author = ...
let books: [Book] = author.fetchAll(db, Author.books)
let books: DatabaseCursor<Book> = author.fetchCursor(db, Author.books)
let book: Book? = author.fetchOne(db, Author.books) |
Very exciting. |
Can't wait to play with that. Will do soon in my next project. |
struct Author {
static let books = hasMany(Book.self)
...
}
struct Book {
let authorId: Int64?
...
}
let authors: [Author] = try Author.filter(Author.books.isEmpty).fetchAll(db)
let authors: [Author] = try Author.filter(!Author.books.isEmpty).fetchAll(db)
let authors: [Author] = try Author.filter(Author.books.count == 1).fetchAll(db) // ==, !=, <, <=, >, >=
let currentlyActiveAuthors: [Author] = try Author
.filter(Author.books.filter(Column("year") > 2010).count > 0)
.order(Column("name"))
.fetchAll(db) When you compose two struct Person {
...
}
struct Citizenship {
static let person = belongsTo(Person.self)
let countryCode: String
let personId: Int64
...
}
struct Country {
static let citizenships = hasMany(Citizenship.self)
static let citizens = hasMany(Citizenship.person, through: citizenships) // <- HasManyThroughAssociation
...
} HasManyThroughAssociation has the same API has the regular two-tables HasManyAssociation: let graph: [(Country, Person)] = try Country.joined(with: Country.citizens).fetchAll(db)
let graph: [(Country, Person?)] = try Country.leftJoined(with: Country.citizens).fetchAll(db)
let graph: [(Country, [Person])] = try Country.including(Country.citizens).fetchAll(db)
let graph: [(Country, Int)] = try Country.annotated(with: Country.citizens.count).fetchAll(db)
let countries: [Country] = try Country.filter(Country.citizens.isEmpty).fetchAll(db)
let persons: [Person] = try country.fetchAll(db, Country.citizens) |
Hi @groue! I'm going to start a new project soon and plan on using GRDB.swift again. What is the state of the Graph API? I would love to use it, but I'm a bit worried depending on it for a production application if its on an experimental branch that is lagging behind master. Anything I can do to help? |
Hello @hartbit, glad to hear from you!
Couldn't be more honored.
It's in great shape, but it's very lacking in terms of documentation, and it hasn't been rebased since v2.0.3 (when we're at v2.4.0 already). I dislike using tests as doc, but today this is the only available material (files that contain HasOne, HasMany, or BelongsTo in their names inside https://github.com/groue/GRDB.swift/tree/Graph/Tests/GRDBTests). Now the bad news: I'm willing to kill this experiment, and not release it. I have been trying to write a rationale and share it in this issue, but couldn't shape my ideas well enough. Now your question kicks me out of the closet:
To put it in another way: the Graph API had me think more about the topic of object graphs, until I now see it as a wrong solution to a badly expressed problem :-) |
Hi guys, Thank you for really thinking it through. The topic is extremely non-trivial and I really hate how most ORMs handle that. I am working as both Android and iOS developer and recently in Android we got official google library to handle SQLite. I find it quite decent and you could take a look at how they handle Relations, maybe it will give you some new ideas: May I ask how do you handle many to one and one to many relations right now with GRDB? Are you making multiple calls to database or make join and then segregate data? |
Hello @Sroka, thanks for the link to the Android relations. There is indeed an overlap with the Graph API.
You mean, how do I handle them in my applications?
That's my current life with GRDB. Without any Graph API, I unfortunately have to write dedicated code. Without the Graph API, I enjoy the fact that both trivial and complex cases look almost the same, and that when a trivial case turns out to be not so trivial, there is not much refactoring or mental shift involved. That last point is at the core of my embarrassment publishing the Graph API, an API that does not scale well with application complexity. |
@groue Thank you very much for detailed explanation. I must say that I really like this conservative approach. While being verbose (and as someone with Java background I am quite used to verbosity :P ) it is easy to tell what will be the result of a query. If it comes to Graph API it still looks nice and could be used for simpler cases. As an alternative, if someone is using RxGRDB, results of joins can be easily aggregated with groupBy operator |
Yes. But as I told above, there is a huge gap between simple cases and just-slightly-less-simple cases. And it's very difficult to know, in advance, if your case is simple enough or not. Given a task, it can easily happen that you invest time feeding the Graph API with association declarations, only to eventually find out that this is a dead-end. There are two ways out of this dead-end: stop using the Graph API, or extending the Graph API with the missing features. Extending the Graph API is not only difficult in terms of implementation (but I'm responsible for that because of the current state of the code), but also excruciatingly difficult in terms of consistency. Does the extension support filtering? ordering? grouping? counting? recursive associations? left and right joins? support for raw sql snippets? custom table aliases required by some raw sql snippets? natural combination with other Graph API feature? To make you understand this last point, consider just a few of the many features you get with a simple hasMany relation: struct Book: TableMapping, RowConvertible {
static let databaseTableName = "books"
}
struct Author: TableMapping, RowConvertible {
static let databaseTableName = "authors"
static let books = hasMany(Book.self)
} // SELECT * FROM \"authors\";
// SELECT * FROM \"books\" WHERE (\"authorId\" IN (...))
//
// Returns [(left: Author, right: [Book])]
Author
.including(Author.books)
.fetchAll(db) // SELECT "authors".*, COUNT("books"."id")
// FROM "authors"
// LEFT JOIN "books" ON ("books"."authorId" = "authors"."id")
// GROUP BY "authors"."id"
//
// Returns [(left: Author, right: Int)]
Author
.annotated(with: Author.books.count)
.fetchAll(db) // SELECT "authors".*
// FROM "authors"
// LEFT JOIN "books" ON ("books"."authorId" = "authors"."id")
// GROUP BY "authors"."id"
// HAVING (COUNT("books"."id") > 0)
//
// Returns [Author]
Author
.filter(Author.books.count > 0)
.fetchAll(db) And that's only a tiny tiny window on the hasMany association. A combinatorial explosion happens when you combine Graph API features together. The current belongsTo, hasMany, hasOne, hasManyThrough and hasOneThrough associations are covered by hundreds of tests (ok, maybe not, but really a lot) that check a heck a lot of combinations, details, and corner cases. This means that I don't expect any future PR to sustain this level of consistency. Consistency is a quality that is mandatory whenever you hope that users can build a regular and simple mental model of any API. There are reasons why some people like GRDB, and the fact that it appears easy to grasp is one of the key reasons. The fact that it looks easy is a consequence of its inner consistency, which allows exploring rooms where expected doors are there, open and ready to be crossed.
I'm not sure I understand. Would you care to elaborate? |
By the way, this is the second failed attempt at dealing with joins (previous was #103). I'm amazed by the level of maturity required by this topic 😅, and also by the amount of work (code + documentation) I need until a solution eventually appears as a dead-end. I don't know yet how next attempt will look like! |
I can't refrain from feeling a little bit disappointed that we have reached a dead-end for now. But I'm reassured that you're taking the time to make sure we end up with something that is well designed and thought out. 👍 Let me make a few remarks:
This already happens to me today, but with the current Query interface. I start using the Query interface because I need a simple select, for example a list of authors. Then the design team comes to me asking that we add the number of books from the author on the cell, and I have reached a dead end that forces me to stop using the Query interface and revert to pure SQL. Perhaps the solution is to accept that the join API should not handle all the cases/complexity. But if it could handle some of the simple cases, it would drastically reduce the number of times I have to resort to pure SQL. And while I love using SQL, and I admit that its the only solution for truly complicated queries, the loss of compile-time checking for simple cases feels a pity. To give you an idea, I've taken the time to categories all custom SQL queries I do in my latest big app and tried to guess which are simple enough that they could be handled by a simple API. I've renamed sensitive table and field names for privacy reasons, but apart from that, those are real queries I use. Once we everything up, a simple API like the one you have built would reduce my number of custom queries from 26 to 4, which would be huge for me. Please let me know what you think. Annotation predicates (15 queries)Most of my custom queries are those. I need to join only to go test a predicate of a child/grand-child table. For example: // SELECT reports.*
// FROM reports
// INNER JOIN teams ON teams.reportId = reports.id
// INNER JOIN teamMembers ON teamMembers.teamId = teams.id
// WHERE reports.isDeleted = 0
// AND reports.projectId = :projectId
// AND teamMembers.employeeId = :employeeId
// GROUP BY reports.id
//
// Returns [Report]
Report
.filter(!Report.isDeleted && Report.projectId == projectId)
.filter(Report.teams.teamMembers.employeedId == employeeId)
.fetchAll(db) ❓ Question: Is the example above simple to support? We need to traverse two joins to get to the field which the filter predicate is based on. Annotation requests (7 queries)I have a bit less of those. For example: // SELECT bookId, max(chapters.lastUpdated) AS lastUpdated
// FROM books
// LEFT JOIN chapters ON books.id = chapters.bookId
// GROUP BY bookId
//
// Returns [(Int, Date)]
Book
.select(Book.id)
.annotated(with: max(Book.chapters.lastUpdated))
.fetchAll(db) Complex queries (4 queries)Those would be a mess to do without raw SQL. I'm okay with that. For example: SELECT
libraries.*,
books.*,
group_concat(DISTINCT authors.name) AS authorNames,
count(chapters.id OR
preludes.id OR
appendixes.id) AS totalChapterCount,
IFNULL(books.id, RANDOM()) as groupId
FROM libraries
LEFT JOIN books ON libraries.id = books.libraryId
LEFT JOIN bookAuthors ON books.id = bookAuthors.bookId
LEFT JOIN authors ON bookAuthors.authorId = authors.id
LEFT JOIN chapters ON books.id = chapters.chapterId
LEFT JOIN preludes ON books.id = preludes.bookId
LEFT JOIN appendixes ON books.id = appendixes.bookId
WHERE libraries.id = :libraryId
GROUP BY groupId |
(It's funny that this all started with the humble phrase "Joins in the Query Interface" !)
Just one man's opinion: I completely agree with Groue. I am pleased that the 'graph' experiment has been set aside. It's too conceptually messy - I would go further and say: it is a poorly founded idea - to try to map the almost natural-language like flexibility of SQL joins, to the structure of GRDB. I would phrase the above more simply, "Extending the Graph API is difficult in implementation, but also difficult conceptually." It's just not an elegant or well-founded "thing". The overall graph experiment was (I think) an investigation in to that idea: it was not "version one of a solution". So where I mention above, "it is a poorly founded idea", the fuller expression there is; the whole topic (of "incorporating broader SQL") is so subtle and complex, it took this much investigation to learn what we now know: its' a poorly founded idea that will never be consistent and conceptually sound (setting aside implementation difficulties, and that's a big one!) (Thus for example, hartbit, looking at your three categories of example sql there (15/7/4), I personally don't really see anything that immediately jumps out which suggests - let us put it this way - weakening the GRDB concept by adding "a little" special handling.) (For example, the slightest change in the business needs of those queries, would wildly change the "categories" you know? I don't really see a real, safe, meaningful "simpler set" - one view!) (Interestingly, I personally came to GRDB for the specific reason that you can let us say inherently and naturally use "full sql when you need to" in GRDB; the other older libraries fall-down in this. To me - again it can only be one man's opinion - it would be a false/inelegant goal for a Swift SQL library to think that all sql can be re-presented.) |
A few counter-arguments:
I think there is a big difference between the goals of mapping the whole flexibility of the SQL syntax and mapping only the simple cases. That's the whole reason the Query interface exists today: it's a nicer interface for the simple cases. There are two extremes: no interface whatsoever, always full raw SQL, which is rather inconvenient, and at the opposite extreme, trying to map the whole SQL language, which would make for a complicated mess. All this to say that I don't disagree with the premise, that mapping the flexibility of joins is too messy, but that we don't need to map the full flexibility of joins to support some fairly common use cases.
I have the same problem today when I start with a simple Query interface request and have to go one level up in complexity, forcing me to rewrite the query in raw SQL. I'm just advocating for the barrier to be slightly higher and support a few common join operations.
I also came to GRDB for the power to express "full sql when you need to". I think we just disagree on when we need to 😄 |
Hello folks, here is some update! Since building joined queries is so delicate, I've changed my focus a little bit, on the consumption of joined queries. https://github.com/groue/GRDB.swift/blob/development/README.md#joined-queries-support I don't expect those APIs to be very controversial, since they extend existing GRDB techniques, and attempts at standardizing a safe way to deal with complex SQL queries. Ha, and we can now write nested Decodable records 😄 |
Obsoleted by #319 |
I wouldn't mind working on joins for the query interface :) Do you have any thoughts on how you would have liked the feature to be designed? It's technically non-obvious because as soon as the query concerns multiple tables, you need columns to be referenced by the table name to avoid all ambiguity.
The text was updated successfully, but these errors were encountered: