Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Joins in the Query Interface #176

Closed
hartbit opened this issue Feb 25, 2017 · 50 comments
Closed

Joins in the Query Interface #176

hartbit opened this issue Feb 25, 2017 · 50 comments

Comments

@hartbit
Copy link
Contributor

hartbit commented Feb 25, 2017

I wouldn't mind working on joins for the query interface :) Do you have any thoughts on how you would have liked the feature to be designed? It's technically non-obvious because as soon as the query concerns multiple tables, you need columns to be referenced by the table name to avoid all ambiguity.

@groue
Copy link
Owner

groue commented Feb 26, 2017

Ha, joins 🦄!

That's a uneasy topic. May I suggest a reading list?

  • I'm not interested in a syntax-based approach that attempts to reproduce the full extent of SQLite support for joins, as in SQLite.swift.

    This approach checks "joins" on a list of features, but that's it. It requires SQL knowledge (one has to build the SQL join in one's head before "translating" it into Swift), and yet is less readable than SQL. And it leaves users stupid and naked when they are able to build a complex joined SQL query, but won't find how to express it with SQLite.swift. That's not the GRDB spirit.

    If GRDB ever supports joins, it will support a limited series of use cases that involve joins.

  • A failed attempt for GRDB: Joins DSL #103 (doc)

  • The "Association" chapter of the How to build an iOS application with SQLite and GRDB.swift article (near the end).

  • The Diesel approach (Diesel is a Rust SQL query builder)

  • The sample code at the end of the Custom Requests GRDB documentation chapter (the one that uses row adapters)

  • This GRDB article

Bring in your imagination, @hartbit, because it's more a design topic than a technical one!

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

I've added all those articles to my reading list. Will come back here once I've had time to digest them all.

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

What do you mean by the "syntax-based approach" of SQLite.swift? Can you explain in more detail what you don't like about it? I'm not a huge fan of it either, but it has the advantage of name-checking column names.

@groue
Copy link
Owner

groue commented Feb 27, 2017

"syntax-based approach"

I mean that SQLite.swift sticks to the SQL syntax, in its attempt at being able to generate as many SQL requests it can. It is a 1 to 1 mapping from Swift to SQL.

For example, it forces users to manage table and column aliases in order to avoid ambiguity, just like SQL.

It also forces the user to bend his mind around a Swift API that brings nothing to raw SQL. Why invent a new, poorer language, when SQL is already there? For me, it is is nothing but an academic exercice, with no practical value except Swift training. And it is a difficult new language: non trivial SQL queries are difficult to express with SQLite.swift: users have to sweat their Swift code, assuming they manage to write it.

Yeah, yeah, name-checking, type safety. Well, when name-checking and type safety prevent me from writing the SQL I need, they'd better back away. You may have noticed that SQL is a first-class citizen here: that's because when the GRDB query builder misses features, it's easy to plug some raw SQL and call it a day. You don't lose anything: fetched rows are still rows, fetched records are still records, etc.

A general-purpose join builder that mimicks SQL is not a goal, because it would be plain louzy. Even the status quo (no support for joins) is less louzy.

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

For example, it forces users to manage table and column aliases in order to avoid ambiguity, just like SQL

Yes, a good design should make it easy impossible to generate ambiguities.

Yeah, yeah, name-checking, type safety. Well, when name-checking and type safety prevent me from writing the SQL I need, they'd better back away. You may have notice that SQL is a first-class citizen here: that's because when the GRDB query builder misses features, it's easy to plug some raw SQL and call it a day. You don't lose anything: fetched rows are still rows, fetched records are still records, etc.

I totally understand, and thats why I love GRDB so much. But being able to express a query with the Query Interface allows a few niceties. For example, I've found it very useful to have QueryInterfaceRequest generating functions from my model objects and let the clients of those functions decide if they want to fetchCount or fetchAll. When writing raw SQL, I can't do that. I have to write two separate queries depending if I'm fetching or counting.

I'll continue reading the other material :)

@groue
Copy link
Owner

groue commented Feb 27, 2017

For example, it forces users to manage table and column aliases in order to avoid ambiguity, just like SQL

Yes, a good design should make it easy impossible to generate ambiguities.

The failed PR #103 (doc) did achieve this goal. But it was not welcomed by the early adopters who were nice enough to evaluate it (I'm so grateful!). Since then, I really think Diesel has nailed something important, but that's still fuzzy in my mind.

I'll continue reading the other material :)

I'm glad you do! Don't miss row adapters because I do think that they can help consuming joined queries (generating them is useless if one can't extract values from the results, isn't it?).

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

I'm glad you do! Don't miss row adapters because I do think that they can help consuming joined queries (generating them is useless if one can't extract values from the results, isn't it?).

Yes, I've been thinking about that a lot. But even without them, it's still interesting to be able to use joins to filter rows. In those cases, you wouldn't need a hypothetical "join-row-adapter". But I agree they should come in the picture.

@groue
Copy link
Owner

groue commented Feb 27, 2017

For example, I've found it very useful to have QueryInterfaceRequest generating functions from my model objects and let the clients of those functions decide if they want to fetchCount or fetchAll. When writing raw SQL, I can't do that. I have to write two separate queries depending if I'm fetching or counting.

We could imagine a general fetchCount method on the Request protocol:

extension Request {
    // Not tested
    func fetchCount(_ db: Database) throws -> Int {
        let (statement, _) = try prepare(db)
        let sql = statement.sql
        return try Int.fetchOne(db, "SELECT COUNT(*) FROM (\(sql))", arguments: statement.arguments)!
    }
}

Not the most efficient SQL ever, but it would work:

func people(in country: Country) -> AnyTypedRequest<Person> { ... }
let request = people(in: france)
let frenchies = try request.fetchAll(db)
let frenchCount = try request.fetchCount(db)

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

That sounds like a good thing to do, even if we end up having a join query interface :)

@hartbit
Copy link
Contributor Author

hartbit commented Feb 27, 2017

@groue What did you like in your original approach and would like to keep? What did you not like and would avoid? What did you enjoy in the Diesel approach? Just trying to understand to come up with a solution you have already thought of and already dismissed.

groue added a commit that referenced this issue Feb 28, 2017
@groue
Copy link
Owner

groue commented Feb 28, 2017

Some desired features of a future join API:

  • It is based on records, just like the current query interface.

    You have seen that the query interface does not generate SQL out of the blue in an abstract space. No: it starts from application records, and its main task is to feed those records:

    // [User]
    let users = try User
        .filter(emailColumn != nil)
        .fetchAll(db)

    vs, for example:

    // meh
    let users = try Table("users")  // gratuitous Table type
        .filter(emailColumn != nil)
        .fetchRows(db)              // who cares about rows...
        .map { User(row: $0) }      // ...when we need users?

    This should remain, as much as possible: the goal is that GRDB feeds the application, not the other way around. See this comment for another example of this idea.

  • It has a concept of association.

    Both the initial PR Joins DSL #103 and Diesel have this concept. It is already well known by users familiar with traditional ORMs. Those associations directly map to some database schema:

    • Author has many books, book belongs to author:

      +----------+    +---------+
      |  books   |    | authors |
      |----------|    |---------|
      | authorId |--->| id      |
      +----------+    +---------+
      
    • Country has many citizens through citizenships, citizen has many countries through citizenships:

      +-----------+    +--------------+    +----------+
      | countries |    | citizenships |    | citizens |
      |-----------|    |--------------|    |----------|
      | code      |<---| countryCode  |    |          |
      |           |    | citizenId    |--->| id       |
      +-----------+    +--------------+    +----------+
      
    • The has many ... through relation can be made a little bit more complex, when the intermediate table has extra columns:

      +-----------+    +--------------+    +----------+
      | countries |    | passports    |    | citizens |
      |-----------|    |--------------|    |----------|
      | code      |<---| countryCode  |    |          |
      |           |    | citizenId    |--->| id       |
      |           |    | number       |    |          |
      |           |    | deliveryDate |    |          |
      |           |    | expireDate   |    |          |
      +-----------+    +--------------+    +----------+
      

      One can also say from the schema above that passport belongs to country, passport belongs to its citizen, country has many passports, and citizen has many passports.

    • Traditionally, there is also the has one association, which is based on the same database schema than has many, with the difference that the application expects at most one matching row: vehicle has one insurance policy, insurance policy belongs to vehicle.

    We should take for granted that those associations pre-exist in the user's mind. They should be easy to express in GRDB.

  • It supports all primary keys.

    Records work just fine whenever the table has an auto-incremented integer primary key, another single-column primary key, a compound primary key, or even no primary key at all. This feature should be preserved, with schema inference when possible (see Database Schema Introspection)

The list of desired use cases based on associations has still to be established. I need to tell what I like about Diesel first.

@hartbit
Copy link
Contributor Author

hartbit commented Mar 6, 2017

After reading everything, it seems that the only feature that your old solution was missing compared to the points you mentioned above is relationships with cardinality (has many/has one/belongs to). Correct?

While the idea of using row adapters for handling disambiguation is interesting, I keep coming back to the idea that if columns and relations could be configured with the table they correspond to, we could have a simpler API:

extension Chapter {
    static let id = Column(table: Chapter.self, name: "id")
    static let title = Column(table: Chapter.self, name: "title")
    static let bookId = Column(table: Chapter.self, name: "bookId")
    static let book = Relation(column: Chapter. bookId, to: Book.self)
}

extension Book {
    static let id = Column(table: Book.self, name: "id")
    static let title = Column(table: Book.self, name: "title")
    static let authorId = Column(table: Book.self, name: "authorId")
    static let author = Relation(column: Book.authorId, to: Author.self)
}

extension Author {
    static let id = Column(table: Author.self, name: "id")
    static let name = Column(table: Author.self, name: "name")
}

let result = try Chapter
    .join(Chapter.book)
    .join(Book.author)
    .filter(Chapter.id == 1)
    .select(Author.name)
    .bound(to: String.self)
    .fetchOne(db)

If ambiguities come up because the same table is joined multiple times, perhaps we could use aliases. I know it's not very clean as is, but I don't easily think of joins in a hierarchical manner so this structure comes more naturally to me

let result = try Person
    .join(Person.mother, aliased: "mother")
    .join(Person.father, aliased: "father")
    .join(Person.mother, from: "mother", aliased: "mother_grandmother")
    .join(Person.father, from: "mother", aliased: "mother_grandfather")
    .join(Person.mother, from: "father", aliased: "father_grandmother")
    .join(Person.father, from: "father", aliased: "father_grandfather")

This is just some initial thoughts.

@hartbit
Copy link
Contributor Author

hartbit commented Mar 6, 2017

PS: The thing I like about attaching the column and relation info with the table is that once Swift gets more powerful features like property behaviours and macros, we should hopefully get those generated for free with a little bit of decoration.

@groue
Copy link
Owner

groue commented Mar 6, 2017

The API you propose is inspired by #103, isn't it? This PR did not find its public. Do you think that the removal of automatic management of aliases, the extra complexity of the Column type, and the replacement of strings with types, address those concerns (maybe they do)?

@hartbit
Copy link
Contributor Author

hartbit commented Mar 6, 2017

Here's my take on the criticisms of #103:

@mtissington says:

With that said, my question that comes to mind is what is the use case for implementing Joins?
Why should I learn to do it this way when I can already do everything in SQL? [...] It seems to me that in memory databases, for example, bring more value than an additional layer to support joins.

This sounds to me like a criticism of the feature itself, not its design or implementation. But I think we agree that a JOIN layer is useful for some people (I would strongly benefit from it). Having to resort to SQL is sometimes necessary, but we loose the many benefits of the query interface: compile-time checking of column and relation names, flexibility to adapt a query to change what is fetched, etc...

In @palpatim's answer, he starts with the following paragraph:

For me, the benefit of an ORM is transparently handling reads/writes to a database via a straight-up OO interface. It's been years since I used ActiveRecord, and then only lightly, but my recollection is that it had a declarative interface that allowed you to map relationships in the model and then not have to think about them too hard at the call site. When I see examples in your API draft documentation like let books = Book.include(author).fetchAll(db) it feels like we're going the opposite direction, requiring a developer to think too hard about table structures at the call site. [...] Recognizing that this is an early draft, is it part of your future direction to move to a model where a developer could largely ignore the underlying database, not just ignore SQL?

This seems to me like he prefers true ORM style libraries which do more for him. But GRDB has clearly distanced itself from them to avoid the famous 1+N problem. In that regard, your implementation was following GRDB philosophy.

Nonetheless, he says:

(Scoping rows especially seems burdensome.)

And I think I agree with that. Having to think in terms of hierarchical scopes is fairly different to how the SQL syntax represents them. My mental model has adapted to the way they are expressed in SQL and the hierarchical nature of the scopes got me a little confused. That's why I reverted to the alias solution in my code above.

@cfilipov echoes my thoughts:

I'd like to jump in and offer the opposite perspective to the previous comment. I don't want an ORM. I used GRDB specifically because it's a very nice Swift layer on top of SQLite. It lets me write my queries and map my models without much unnecessary baggage.

One of his negative comments about your proposal was:

In the example above, two relations (authors and publishers) are defined exactly the same way, they are both joined on books, but you have to use include for authors and join for publishers? I understand publishers is only used for the predicate, so this is likely why, but it's confusing.

This caught me off-guard initially also. But once I understand that it existed to quickly SELECT on the joined table fields, I've found it very convenient.

He later comments:

think the proposed API is too low level and complex for someone who wants the SQL abstracted away (as noted by @palpatim). But it's also too unfamiliar for someone who does know SQL and just wants a light wrapper in Swift. Personally, I would rather do more work dealing with the object graph if that means having a closer mapping to the SELECT grammar.

I think his remark stems from the same scoping and hierarchical issues I detailed above. But only he can say :)

As you see, I have the impression that most criticisms in that post were either about the philosophy behind GRDB (one user asking for less query interfacing, the other for more automation), and the few real criticisms seem the centre around the complexities of the scoping mechanism. So I'd tend to say that the direction you were going in was okay, but perhaps just needs some rework in terms of handling table ambiguities.

I hope that the example code I posted above is a potential direction.

@groue
Copy link
Owner

groue commented Mar 7, 2017

Hello @hartbit

Here's my take on the criticisms of #103:

That's really great, because there's a lot of information, desires, and fears there. I won't always agree with you, but I'll try of explain my points of view as clearly as I can.

@mtissington says:

Why should I learn to do it this way when I can already do everything in SQL?

This sounds to me like a criticism of the feature itself, not its design or implementation. But I think we agree that a JOIN layer is useful for some people (I would strongly benefit from it). Having to resort to SQL is sometimes necessary, but we loose the many benefits of the query interface: compile-time checking of column and relation names, flexibility to adapt a query to change what is fetched, etc...

@mtissington indeed does not need a join DSL. Does it mean that we can ignore his opinion? I'm not sure: I second him when I doubt that a DSL that mimics SQL (importing all its idiosyncrasies such as manual aliases, for example) has any value that goes beyond the joy of completing a funny intership project.

You cite "compile-time checking of column and relation names".

But this is only a facet of the query interface, and a much overblown one as a matter of fact. Nothing prevents you from using wrong columns with the wrong table, or misnamed columns (easily unnoticed), or full-text operators with non-full-text tables, etc. The world of query-interface mistakes is much wider than its compilation errors. You can avoid some of them with a strong discipline: long lists of static Column constants such as User.name, User.id, and requests that are eye-checked, like User.filter(User.email != nil). I don't like those lists, and eye checks are not compiler checks.

You also quote the "flexibility to adapt a query to change what is fetched".

The query interface is indeed based on method chaining: User.filter(...).order(...).... There is a warm promise of reusable requests here, where a fundamental request could be refined, adapted, when needed. But in reality this opportunity is a side-effect, and not a design goal of the query interface. Reusable code is rare, unfortunately. Ordering clauses don't chain (the best of worse designs). Generally code clarity and maintainability require that requests are fully defined from scratch, and do not derive from other pre-existing requests.

A bigger value of the query interface lies in its ability to talk to developers who aren't familiar with SQL. I added the query interface for one of my co-worker who didn't feel comfortable with SQL. When the trivial User.filter(emailColumn != nil) doesn't look much different from SELECT * FROM users WHERE email IS NOT NULL for a knowledgable developer, it's night and day for the inexperienced one.

#103 did not score well on that scale, as a matter of fact. And the query interface has support for advanced SQL features such as WHERE x NOT IN (SELECT ...), that almost certainly nobody uses, and show that my ideas on that subject has evolved:

Now I think that spending a lot of energy on a SQL-mimicking DSL (especially for joins) is bound to deceive both advanced developers (who will spend time converting natural SQL into Swift, and face some inevitable blind spots), and beginners (who will remain helpless and miserable due to their inability to build SQL joined queries in their mind). There is a real danger of wasting time here - for you, me, and GRDB users of various skills.

I'm sure we can avoid this trap. Let's see what @palpatim has to say:

In @palpatim's answer, he starts with the following paragraph:

For me, the benefit of an ORM is transparently handling reads/writes to a database via a straight-up OO interface. It's been years since I used ActiveRecord, and then only lightly, but my recollection is that it had a declarative interface that allowed you to map relationships in the model and then not have to think about them too hard at the call site.

What he means is that ActiveRecord lets you write: book.author (triggers a lazy loading of the book's author), author.books (same), author.books.size (count via SQL). GRDB can't follow this. Because it lacks the meta-programming abilities of Ruby, of course, but also because GRDB does not like lazy loading. Lazy loading does not fit with immutability, and GRDB derives much of its thread-safety from immutability. Lazy loading prevents the control of when the database is accessed, another fundamental of GRDB thread-safety. GRDB safety rules require an explicit instance of database queue or pool for accessing the database, not visible in expressions like book.author.

ActiveRecord is also able of Book.includes(:author).each { ... }, which iterates all books with their author, and avoids the 1+N problem. It can also do Author.include(:books).each { ... }. These are features we can aim at (#103 could only do the first, since the second involves at least two SQL queries).

Back to @palpatim, who has a valid request: "my recollection is that [ActiveRecord] had a declarative interface that allowed you to map relationships in the model and then not have to think about them too hard at the call site."

Those declarations are the relationships I talked about above, and that are well described in ActiveRecord documentation. And those relationships are concepts that fit well the beginner developers I'd like that we'd care about.

When I see examples in your API draft documentation like let books = Book.include(author).fetchAll(db) it feels like we're going the opposite direction, requiring a developer to think too hard about table structures at the call site. [...] Recognizing that this is an early draft, is it part of your future direction to move to a model where a developer could largely ignore the underlying database, not just ignore SQL?

I don't follow @palpatim here: the code he criticizes is almost identical to the Ruby books = Book.includes(:author).

This seems to me like he prefers true ORM style libraries which do more for him. But GRDB has clearly distanced itself from them to avoid the famous 1+N problem. In that regard, your implementation was following GRDB philosophy.

Good ORM libraries avoid the 1+N problem, too. What GRDB doesn't inherit from traditional ORM is lazy loading, record auto-updating, record uniquing (because immutability), and automatic relationships methods (because it can't).

#103 also had a strong SQL smell that did repell @palpatim - and I have explained above why I also doubt, now, that sticking to SQL is a valuable option. Of course relationships are fundamentally based on the database schema, and any documentation has to illustrate relationships with database tables (example). But talking about the database schema does not imply forcing SQL down the reader's throat.

Nonetheless, he says:

(Scoping rows especially seems burdensome.)

And I think I agree with that. Having to think in terms of hierarchical scopes is fairly different to how the SQL syntax represents them. My mental model has adapted to the way they are expressed in SQL and the hierarchical nature of the scopes got me a little confused. That's why I reverted to the alias solution in my code above.

Yes. Scopes were the main #103 novelty, and failure. They were used as paths to navigate inside the linear selection of an SQL query (accessing a and b columns in SELECT a.*, b.* FROM a JOIN b ...). This may be a documentation failure. Or, more certainly, an inadequate framing of the initial problem. Yet, without scopes, we'll have to find another way to access the various parts of the selection. I let you consider this question, as a though experiment. And we'll see below that Diesel has one valid answer.

@cfilipov echoes my thoughts:

I'd like to jump in and offer the opposite perspective to the previous comment. I don't want an ORM. I used GRDB specifically because it's a very nice Swift layer on top of SQLite. It lets me write my queries and map my models without much unnecessary baggage.

Isn't this a variation on @mtissington's point? Anyway, you're both right. But I wonder how he maps SELECT a.*, b.* FROM a JOIN b ... to As and Bs without "much unnecessary baggage". Maybe he uses numerical column indexes. Or he writes the SQL so that the selection contains distinct column names. This totally works, but an eventual join DSL could help a lot here.

One of his negative comments about your proposal was:

In the example above, two relations (authors and publishers) are defined exactly the same way, they are both joined on books, but you have to use include for authors and join for publishers? I understand publishers is only used for the predicate, so this is likely why, but it's confusing.

This caught me off-guard initially also. But once I understand that it existed to quickly SELECT on the joined table fields, I've found it very convenient.

Yes. joins vs. includes were inspired by ActiveRecord. He missed it because the draft documentation was very draft.

He later comments:

think the proposed API is too low level and complex for someone who wants the SQL abstracted away (as noted by @palpatim). But it's also too unfamiliar for someone who does know SQL and just wants a light wrapper in Swift. Personally, I would rather do more work dealing with the object graph if that means having a closer mapping to the SELECT grammar.

I think his remark stems from the same scoping and hierarchical issues I detailed above. But only he can say :)

As you see, I have the impression that most criticisms in that post were either about the philosophy behind GRDB (one user asking for less query interfacing, the other for more automation), and the few real criticisms seem the centre around the complexities of the scoping mechanism. So I'd tend to say that the direction you were going in was okay, but perhaps just needs some rework in terms of handling table ambiguities.

I agree with your premises. We may need more than a rework.

I hope that the example code I posted above is a potential direction.

You must have guessed now that I fear encouraging you to continue in that direction.

It's time to turn on some lights, and talk about Diesel.

Diesel

Their documentation starts with:

Note: This feature is under active development, and we are seeking feedback on the APIs that have been released

You bet! The landscape of database libraries in the context of new languages like Swift and Rust, that both foster immutability and protocol(traits)-oriented programming, is bleeding edge. Realm is so 2010! Being experimental is our moral duty 😉

What's the Diesel proposal?

They start with two records:

#[derive(Identifiable, Queryable, Associations)]
#[has_many(posts)]
pub struct User {
    id: i32,
    name: String,
}

#[derive(Identifiable, Queryable, Associations)]
#[belongs_to(User)]
pub struct Post {
    id: i32,
    user_id: i32,
    title: String,
}

Users and posts are bound with a has many and a belongs to relationships. These are the only relationships supported by Diesel.

Note how Diesel records are similar to GRDB records: columns are supported by properties, and they can be implemented as value types (structs) if the library user wants to. Diesel and GRDB share the same fundamentals about what a database model is.

Diesel can load a user's posts:

let user = try!(users::find(1).first(&connection));
let posts = Post::belonging_to(&user).load(&connection);

So far, so good. Diesel can also load all the (User, Post) pairs:

let data: Vec<(User, Post)> = users::table.inner_join(posts::table).load(&connection);

Meh. The value of this is limited: the SELECT users.*, posts.* FROM users JOIN posts ON users.id = posts.user_id query repeats users as many times as they have posts, and this Vec<(User, Post)> is not easily used. We'd prefer something like Vec<(User, Vec<Post>)>, which would allow iterating a set of distinct users, along with their posts.

Fortunately, they provide some sample code that builds this Vec<(User, Vec<Post>)>:

fn first_twenty_users_and_their_posts(conn: &PgConnection) -> QueryResult<Vec<(User, Vec<Post>)>> {
    let users = try!(users::limit(20).load::<User>(conn));
    let posts = try!(Post::belonging_to(&users).load::<Post>(conn));
    let grouped_posts = posts.grouped_by(&users);
    users.into_iter().zip(grouped_posts).collect()
}

And there ends their doc/demo.

That doesn't look very sexy, hu? But we don't know what terrible problems they had to deal with.

Let's forget about the form of their API, and focus on its susbtance. There lies the value: Diesel loads object graphs into standard collections and tuples of records.

Especially:

  • they don't load rows that need to be navigated into. The breakdown of row columns has happened inside Diesel (not in userland code as with Joins DSL #103's publicly exposed scopes.)
  • the navigation inside the object graph is not performed through records. Diesel's user has no method for accessing her posts, and posts doesn't know its user. Instead, the developer is given the graph he has requested (and in the example above, a list of users with their posts: [(User, [Posts])]. Joins DSL #103 had hesitations on that topic.

These ideas are highly inspiring, because they completely avoid whole classes of problems, questions, and extrapolations (feature wishes). No lazy loading. No visible navigation inside the linear database rows. No record methods like post.user or user.posts which means that records can be used both in isolation and in a graph without introducing extra optional properties or methods (and the desire to have lazy versions of those).

Caveat: the number of available join queries has to be limited. Swift or Rust don't compile methods that generate recursive arrays of tuples of arrays of arrays of tuples of arbitrary depth. Static typing is a hard limit.

But within those limits, there are plenty of opportunities to provide useful methods that load useful object graphs in efficient ways.

Yes, support for object graphs is not the same as support for joins. But there is an overlap.

I'd like to see what could be achieved this way!

@groue
Copy link
Owner

groue commented Mar 7, 2017

A side note about request derivation (the feature that I claim to be too much valued, considering how rarely reusable code is).

Sometimes code can be reused. GRDB provides some help for derivation of your custom SQL requests, thanks to the StatementArguments type.

StatementArguments is the type that feeds SQL placeholders. It adopts both ExpressibleByArrayLiteral and ExpressibleByDictionaryLiteral so that you can feed arguments with arrays or dictionaries:

// func fetchOne(_ db: Database, _ sql: String, arguments: StatementArguments? = nil)
User.fetchOne(db, "SELECT * FROM users WHERE id = ?", arguments: [1])
User.fetchOne(db, "SELECT * FROM users WHERE id = :id", arguments: ["id": 1])

Positional (?) and named (:id) arguments can be concatenated, and even mixed (see documentation for more information).

This allows custom requests to build several queries that reuse a common SQL template, and merge various arguments together:

struct MyRecord: RowConvertible {
    static func request1(...) -> AnyTypedRequest<MyRecord> {
        return makeRequest("...", arguments: [...])
    }
    
    static func request2(...) -> AnyTypedRequest<MyRecord> {
        return makeRequest("...", arguments: [...])
    }
    
    static func request3(...) -> AnyTypedRequest<MyRecord> {
        return makeRequest("...", arguments: [...])
    }
    
    private static func makeRequest(_ sqlSuffix: String, arguments suffixArguments: StatementArguments? = nil) -> AnyTypedRequest<MyRecord> {
        // Base sql and arguments
        var sql = "SELECT ... FROM ... "
        var arguments: StatementArguments = [...]
        
        // Append suffix sql and eventual suffix arguments
        sql += sqlSuffix
        if let suffixArguments = suffixArguments {
            arguments += suffixArguments
        }
        
        // Build final request:
        return SQLRequest(sql, arguments: arguments).bound(to: MyRecord.self)
    }
}

try MyRecord.request1(...).fetch...
try MyRecord.request2(...).fetch...
try MyRecord.request3(...).fetch...

@hartbit
Copy link
Contributor Author

hartbit commented Mar 7, 2017

That's really great, because there's a lot of information, desires, and fears there. I won't always agree with you, but I'll try of explain my points of view as clearly as I can.

I think we agree more than you think :)

@mtissington indeed does not need a join DSL. Does it mean that we can ignore his opinion? I'm not sure: I second him when I doubt that a DSL that mimics SQL (importing all its idiosyncrasies such as manual aliases, for example) has any value that goes beyond the joy of completing a funny intership project.

I think its a question of degree. Mimicking SQL down to the syntax is not necessarily useful. As you pointed out at the beginning of the discussion, "This approach [...] requires SQL knowledge (one has to build the SQL join in one's head before "translating" it into Swift), and yet is less readable than SQL.". But when the layer adds useful features on top of a layer of convenience, like the current Query Interface does, it brings value to both experts and newcomers. I think its a question of finding the right balance.

But this is only a facet of the query interface, and a much overblown one as a matter of fact. Nothing prevents you from using wrong columns with the wrong table, or misnamed columns, or full-text operators with non-full-text tables, etc. The world of query-interface mistakes is much wider than its compilation errors.

I agree with you that the world of mistakes is much wider. But I find compile time checking of names very useful when I work, and for more than just avoiding typos. I can make a typo in my stringl-y typed SQL or use the wrong column in my query interface and in both cases I will notice it pretty quickly and fix it. But where it really shines is when I'm reworking my code. Imagine that I rename or remove a field in a Record. The compiler will tell me straight away in what queries I was relying on that field so I know exactly what I need to fix. Much more powerful than having to Search & Replace. It's the same reasoning behind Swift's introduction of #keyPath() and #selector().

You can avoid some of them with a strong discipline: long lists of static Column constants such as User.name, User.id, and requests that are eye-checked, like User.filter(User.email != nil). I don't like those lists, and eye checks are not compiler checks.

I know that this is not ideal. But it seems like something that can and should be improved when Swift gets more metaprograming/reflection features. For example, if Column constants could be automatically generated (I already do this with Sourcery by the way), we could imagine an API like this:

User.filter({ $0.email != nil })

where the argument passed is of type User.Type, allowing us to more naturally reference its column constants and avoid the necessary eye-checking.

The query interface is indeed based on method chaining: User.filter(...).order(...).... There is a warm promise of reusable requests here, where a fundamental request could be refined, adapted, when needed. But in reality this opportunity is a side-effect, and not a design goal of the query interface. Reusable code is rare, unfortunately. Code clarity and maintainability much often require that requests are fully defined from scratch, and do not derive from other pre-existing requests.

Reusable code between projects is rare indeed, and thats where I apply the Rule of Three. But inside a project, I often have the need to reuse some queries. For example:

// If you could easily write a join query to return a `QueryInterfaceRequest
func favoritePosts(for user: User) -> QueryInterfaceRequest<Post> {
    return Post
        .join(User.favoritePosts)
        .filter(!Post.isDeleted && User.id == user.id)
}

// it makes it that much nicer to get the number of favorite posts
let numberOfFavorites = try favoritePosts(for: user).fetchCount()
// or get the favorited posts more recent than a date
let recentFavorites = try favoritePosts(for: user).filter(Post.date > lastYear).fetchAll()
// or even join with another table
let favoritePostsWithAuthor = try favoritePosts(for: user).include(Post.author).fetchAll()

A bigger value of the query interface lies in its ability to talk to developers who aren't familiar with SQL.

It's definitely useful for those users. I just have a strong feeling that it also very valuable to developers more familiar with SQL. But perhaps I am simply projecting my own uses of the library.

What he means is that ActiveRecord lets you write: book.author (triggers a lazy loading of the book's author), author.books (same), author.books.size (count via SQL). GRDB can't follow this.

Yes, that's what I undestood from what he said. And I also agree and understand why GRDB can't follow this.

Good ORM libraries avoid the 1+N problem, too. What GRDB doesn't inherit from traditional ORM is lazy loading, record auto-updating, record uniquing (because immutability), and automatic relationships methods (because it can't).

Side discussion concerning immutability. Is it's a strong goal of GRDB, why does Record exist? It seems like a safe default to choose and the way it works does play well with immutability.

You must have guessed now that I fear encouraging you to continue in that direction.

Don't worry. It was just an idea to spur discussion :)

they don't load rows that need to be navigated into. The breakdown of row columns has happened inside Diesel (not in userland code as with #103's publicly exposed scopes.)

What do you mean by that?

the navigation inside the object graph is not performed through records. Diesel's user has no method for accessing her posts, and posts doesn't know its user. Instead, the developer is given the graph he has requested (and in the example above, a list of users with their posts: [(User, [Posts])]. #103 had hesitations on that topic.

I totally agree with this direction. Getting a [(User, [Posts])] back would play very nicely with Swift and GRDB's goals.

I'd like to see what could be achieved this way!

Me too!

Btw, haven't had time to read your last comment. Will loot at it this evening.

@groue
Copy link
Owner

groue commented Mar 8, 2017

That's really great, because there's a lot of information, desires, and fears there. I won't always agree with you, but I'll try of explain my points of view as clearly as I can.

I think we agree more than you think :)

I'm glad we do! We have both exposed some options in the most radical way, and now we can focus on "finding the right balance", as you say.

Can we agree on a new wording for our goal?

Codename "Graph API": an API that loads object graphs built from relationships between record types.

Documentation will introduce relationships between record types based on patterns of database tables, and expose how GRDB lets the user fetch a set of graphs of objects.

The SQL point of view on the Graph API will be introduced as an extra, for advanced users.

Compilation Checks

I've tried to show that this topic is over-pushed, and you have replied with valid arguments considering your coding practices.

Those two points of view reflect two ways to use GRDB: one that writes as little code as possible (with caveats), and one that goes as far as using Sourcery in order to generate a heavy load of boilerplate compiler constants that help trigger some compiler errors in case of refactoring.

Until now, GRDB has been able to welcome both practices.

And there is no User.filter({ $0.email != nil }) because I believe such methods can live in application extensions until they get solid enough to form an autonomous GRDB-based library that can live in its own repository (please open an issue if you face problems writing such an extra layer).

Unlike #103, the Graph API will be heavily Record-based. New protocols will possibly be introduced, with new constraints on adopting types. Yet you know that I'll look at boilerplate with a critic eye. I don't think that the Graph API will need much declarations beyond individual relationships:

struct Book {
    static let author = BelongsTo(Author.self, ...)
}
struct Author {
    static let books = HasMany(Book.self, ...)
}

// [(Author, [Books])]
let graph = try Author.with(Author.books).fetchAll(db)

And I'll leave Author.with({ $0.books }) to extensions.

Request Refinement

You're right, request refinement is useful. I was trying to say that it is not its major feature, and that we can live with Graph requests that can not be refined.

For example, how do you refine the filtering of books in Author.with(Author.books)? You can't. You have to directly write Author.with(Author.books.filter(type == .novel)).

We can keep request refinement in a corner of our head, but it must not jeopardize the legibility of the Graph API.

Beginner Developers

A bigger value of the query interface lies in its ability to talk to developers who aren't familiar with SQL.

It's definitely useful for those users. I just have a strong feeling that it also very valuable to developers more familiar with SQL. But perhaps I am simply projecting my own uses of the library.

I have started gathering all the JOIN requests that exist in the .swift and .m files on my computer. I hope that'll help figuring out which ones are reasonable to support. Anyway, the reframing of "join DSL" into "Graph API" should already prevent us from too much projection: we'll have to try to reframe our joins in terms of graph :-)

Immutability

Good ORM libraries avoid the 1+N problem, too. What GRDB doesn't inherit from traditional ORM is lazy loading, record auto-updating, record uniquing (because immutability), and automatic relationships methods (because it can't).

Side discussion concerning immutability. Is it's a strong goal of GRDB, why does Record exist? It seems like a safe default to choose and the way it works does play well with immutability.

GRDB never mutates your records on your back.

Record is a class because it needs private storage for changes tracking. And it behaves like a value type if all properties are let. And it exists because you get the full GRDB toolkit with one word.

I think I didn't understand your question.

Diesel

they don't load rows that need to be navigated into. The breakdown of row columns has happened inside Diesel (not in userland code as with #103's publicly exposed scopes.)

What do you mean by that?

I mean that the extraction of a and b columns loaded from SELECT a.*, b.* FROM ... has been done inside Diesel, which outputs A and B records.

the navigation inside the object graph is not performed through records. Diesel's user has no method for accessing her posts, and posts doesn't know its user. Instead, the developer is given the graph he has requested (and in the example above, a list of users with their posts: [(User, [Posts])]. #103 had hesitations on that topic.

I totally agree with this direction. Getting a [(User, [Posts])] back would play very nicely with Swift and GRDB's goals.

That sentence of yours sounds like honey to me. I'm sure we'll build something valuable!

@hartbit
Copy link
Contributor Author

hartbit commented Mar 9, 2017

Those two points of view reflect two ways to use GRDB: one that writes as little code as possible (with caveats), and one that goes as far as using Sourcery in order to generate a heavy load of boilerplate compiler constants that help trigger some compiler errors in case of refactoring.

And it's great that GRDB allows you to work both ways, even in same project, with very little friction.

And there is no User.filter({ $0.email != nil }) because I believe such methods can live in application extensions until they get solid enough to form an autonomous GRDB-based library that can live in its own repository (please open an issue if you face problems writing such an extra layer).

Sounds like a great idea :)

For example, how do you refine the filtering of books in Author.with(Author.books)? You can't. You have to directly write Author.with(Author.books.filter(type == .novel)).

I think that as long as there is an easy way to filter joins (as in your example above), I think we can live without the refinement feature.

I have started gathering all the JOIN requests that exist in the .swift and .m files on my computer. I hope that'll help figuring out which ones are reasonable to support. Anyway, the reframing of "join DSL" into "Graph API" should already prevent us from too much projection: we'll have to try to reframe our joins in terms of graph :-)

I'll look into mine too. Once an API starts emerging, I'll look through them to see which can and can't be implemented with the Graph API.

GRDB never mutates your records on your back.

Yes, of course :)

Record is a class because it needs private storage for changes tracking. And it behaves like a value type if all properties are let. And it exists because you get the full GRDB toolkit with one word. I think I didn't understand your question.

I didn't express my point well enough. What I meant is that, because I use Record, and because its a class, I've already experienced bugs where I had manipulated a Record in parts of my application and another part of the application holding to the same reference got very confused by that :) It would not have happened with structs.

My point is that Record is so prominently advertised in the documentation that it almost feels like the natural type to use. I think it might be worth spending more time advertising the use of value types as you did in your blog post.

That sentence of yours sounds like honey to me. I'm sure we'll build something valuable!

Where do we go from here? Some more questions?

  • Do you still intend to support the case where a join is used for filtering purpuses and those records should not be included in the results (include vs join)? Example:
let authorsAndParagraphsInEvenChaptersOfNovels: [(Author, [Paragraph])] =
    Author.join(Author.books.filter(Book.type == .novel))
          .join(Book.chapters.filter(Chapter.index % 2 == 0))
		  .with(Chapter.paragraphs)
// SELECT authors.*, chapters.*
// LEFT JOIN books ON authors.id = books.authorId AND books.type = 'novel'
// LEFT JOIN chapters ON books.id = chapters.bookId AND chapters.index % 2 = 0
// LEFT JOIN paragraphs ON chapters.id = paragraphs.chapterId
  • Do you still intend to only allow filtering inside a join call as a way to solve ambiguities? Example:
  • Do you intend to allow specifying between LEFT JOIN and INNER JOIN semantics? Sounds like something worth it. I know that INNER can be implemented with LEFT and a NOT NULL condition, but sounds verbose to force users to specify it this way.

@groue
Copy link
Owner

groue commented Mar 10, 2017

I have started gathering all the JOIN requests that exist in the .swift and .m files on my computer. I hope that'll help figuring out which ones are reasonable to support. Anyway, the reframing of "join DSL" into "Graph API" should already prevent us from too much projection: we'll have to try to reframe our joins in terms of graph :-)

I'll look into mine too. Once an API starts emerging, I'll look through them to see which can and can't be implemented with the Graph API.

Yes @hartbit. That'll help us evaluate the range of use cases that we need/wish to support.

GRDB never mutates your records on your back.

Yes, of course :)

Record is a class because it needs private storage for changes tracking. And it behaves like a value type if all properties are let. And it exists because you get the full GRDB toolkit with one word. I think I didn't understand your question.

I didn't express my point well enough. What I meant is that, because I use Record, and because its a class, I've already experienced bugs where I had manipulated a Record in parts of my application and another part of the application holding to the same reference got very confused by that :) It would not have happened with structs.

My point is that Record is so prominently advertised in the documentation that it almost feels like the natural type to use. I think it might be worth spending more time advertising the use of value types as you did in your blog post.

I see what you mean. Until now, Record has served me well, but I have faced troubles as well, and my GRDB best practices are still in flux ;-)

My current bottom line is: the database is the only source of truth. Any fetched record may quicky drift away from the truth since it is essentially a cached value. For example, as a screen is displaying some user's info, a background web request may update the user in the database: the displayed user record may be outdated. As stated in Four different ways to handle SQLite concurrency, such conflicts must be handled by the application (and a framework like Core Data makes conflict handling implicit and hard to deal with - while GRDB prefers to make it explicit, hoping that particular conflicts are easier to deal with than general ones).

That's why a read-only screen may use a FetchedRecordsController, and get automatic updates. A read/write screen may use the update(columns:) method in order to update only a subset of columns. And generally speaking, records should be fetched often, and remain short-lived objects. If you are familiar with server-side development, it can help thinking of application screens as web pages whose generation is based on freshly fetched data. This avoids some Record rot.

All those ideas are difficult to document and foster (assuming they are actual best practices)! And I'm not familiar enough with other application architectures like reactive programming and MVVM to know how they could reframe the question. Meanwhile Record is the simpler type. But you're right, the Record introduction could have a paragraph that targets advanced users and talks about the extra advantages of structs + protocols (without introducing FUD and making Record an unsound base class).

That sentence of yours sounds like honey to me. I'm sure we'll build something valuable!

Where do we go from here? Some more questions?

  • Do you still intend to support the case where a join is used for filtering purpuses and those records should not be included in the results (include vs join)? Example:
let authorsAndParagraphsInEvenChaptersOfNovels: [(Author, [Paragraph])] =
    Author.join(Author.books.filter(Book.type == .novel))
          .join(Book.chapters.filter(Chapter.index % 2 == 0))
		  .with(Chapter.paragraphs)
// SELECT authors.*, chapters.*
// LEFT JOIN books ON authors.id = books.authorId AND books.type = 'novel'
// LEFT JOIN chapters ON books.id = chapters.bookId AND chapters.index % 2 = 0
// LEFT JOIN paragraphs ON chapters.id = paragraphs.chapterId
  • Do you still intend to only allow filtering inside a join call as a way to solve ambiguities? Example:
  • Do you intend to allow specifying between LEFT JOIN and INNER JOIN semantics? Sounds like something worth it. I know that INNER can be implemented with LEFT and a NOT NULL condition, but sounds verbose to force users to specify it this way.

I'd like to reframe those questions: since the Graph API will be documented as a way to fetch record graphs from the database, not as a way to perform joins, we'll implement the SQL needed to implement the graph use cases. Not the other way around. Our knowledge of SQL will be used, but it should not drive the use cases.

You're asking what's next, here are some ideas:

  • Look at our real-life sql joins. Decide which ones could be expressed as a graph request, and would be a great service to users who don't know SQL.
  • Write some pseudo code that we could discuss and compare. Focus on use cases that involve a single relation.
  • Keep the public API simple (if implementation has to be complex, so be it).
  • This is no longer about joins: if one can load all books with their (unique) authors, then consistency requires that one can load all all authors with their (many) books. This last graph request is unlikely to be implemented as a single joined SQL query.
  • Many joined requests will not be generated in the first release.
  • Many joined requests will never be generated by the Graph API.

That's at least what I intend to do. Does it look sensible to you?

@groue
Copy link
Owner

groue commented Mar 11, 2017

Cardinalities

It's worth detailing relationships and the cardinalities guaranteed by the database, that is to say how many objects are involved in both sides of each relationship.

Formal cardinalities (in data modelling) and concrete ones (as implemented by a relational database) do not exactly overlap. For example, "at least one" is not easily expressed in SQL.

I'll focus on concrete ones because they map well with the Swift type system. There are only three concrete relational cardinalities: 0,1, 1, and 0,N, that map exactly on Swift's optional types, non optional types, and collections. For example, when an author "has many" books (0,N), a book may have a mandatory (1) or optional (0,1) author.

  • Parent has many children, child belongs to parent:

    +----------+    +---------+
    | children |    | parents |
    |----------|    |---------|
    | parentId |--->| id      |
    +----------+    +---------+
    

    At the SQL level, we have two variants, depending on whether the children.parentId column can be null or not:

    -- Common
    CREATE TABLE parents (
        id INTEGER PRIMARY KEY, ...
    )
    -- Variant 1
    CREATE TABLE children (
        parentId INTEGER REFERENCES parents(id)
    )
    -- Variant 2
    CREATE TABLE children (
        parentId INTEGER NOT NULL REFERENCES parents(id)
    )

    Both schemas guarantee that any given parent has from zero to any number of children (0,N). The first variant allows children with optional parent (0,1), while the second one guarantees that children have a mandatory (1) parent.

    [parent] --- 0,N --> [child] --- 0,1 --> [parent]
    [parent] --- 0,N --> [child] ---- 1 ---> [parent]
    
  • Parent has one child, child belongs to parent:

    +----------+    +---------+
    | children |    | parents |
    |----------|    |---------|
    | parentId |--->| id      |
    +----------+    +---------+
    

    This is the same schema involved in a has many relationship, with the same variants based on the nullability of children.parentId, and an extra unique index on children.parentId:

    -- Common
    CREATE TABLE parents (
        id INTEGER PRIMARY KEY, ...
    )
    -- Variant 1
    CREATE TABLE children (
        parentId INTEGER UNIQUE REFERENCES parents(id)
    )
    -- Variant 2
    CREATE TABLE children (
        parentId INTEGER UNIQUE NOT NULL REFERENCES parents(id)
    )

    Both schemas guarantee that any given parent has zero or one child (0,1). The first variant allows children with optional parent (0,1), while the second one guarantees that children have a mandatory (1) parent.

    [parent] --- 0,1 --> [child] --- 0,1 --> [parent]
    [parent] --- 0,1 --> [child] ---- 1 ---> [parent]
    
  • Left has and belongs to many rights, right has and belongs to many lefts:

    +-------+    +---------+    +--------+
    | lefts |    | links   |    | rights |
    |-------|    |---------|    |--------|
    | id    |<---| leftId  |    |        |
    |       |    | rightId |--->| id     |
    +-------+    +---------+    +--------+
    

    This relationships is symmetrical, and the inner links table is an implementation detail that only contains foreign keys. At the SQL level, a proper table definition wants links to have a (leftId, rightId) primary key:

    CREATE TABLE lefts (
        id INTEGER PRIMARY KEY, ...
    )
    CREATE TABLE rights (
        id INTEGER PRIMARY KEY, ...
    )
    CREATE TABLE links (
        leftId INTEGER NOT NULL REFERENCES lefts(id)
        rightId INTEGER NOT NULL REFERENCES rights(id)
        PRIMARY KEY (leftId, rightId)
    )
    [left] --- 0,N --> [right] --- 0,N --> [left]
    

@groue
Copy link
Owner

groue commented Mar 11, 2017

Relationships

Now that cardinalities have been detailed, we have a better idea of the Swift relationships we can express:

Relationship Domain
Belongs to (optional) [(Child, Parent?)]
Belongs to (mandatory) [(Child, Parent)]
Has many [(Parent, [Child])]
Has one [(Parent, Child?)]
Has and belongs to many [(Left, [Right])]

We have two variants of belongs to, based on the eventual nullability of the foreign key: an optional one and a non-optional one.

Should both variants be provided in Swift, or should we only provide the optional one?

This leads to a broader question which is the tolerance for loose schemas. A loose database database schema does not use database constraints to enforce the intent of the programmer.

I personally like strict schemas, and GRDB can be rigid sometimes. For example, look at that quote from the fatal errors documentation:

[You get a fatal error when] the database can't guarantee that the code does what it says:

// fatal error: table persons has no unique index on column email
try Person.deleteOne(db, key: ["email": "[email protected]"])

Solution: add a unique index to the persons.email column, or use the deleteAll method to make it clear that you may delete more than one row:

try Person.filter(Column("email") == "[email protected]").deleteAll(db)

But the Graph API has to be much more tolerant, because beginner developers are an important target. And a beginner developer, by definition, is not expert in relational constraints. We'll have to evaluate the consequences of loose schemas on our APIs: some bugs can be tolerated (such as the consequences of a missing unique index in a has one relationship), and some bugs should be carefully evaluated (data loss opportunities, and code that trumps the reader (such as the deleteOne above)).

Distinguishing between belongs to (optional) and belongs to (mandatory) allows to load optional or non-optional parents, but makes the Swift API more complex, and more exposed to bugs introduced by loose schemas.

Supporting only one variant, belongs to (optional), has the advantage of simplifying the API, and supports more loose schemas.

Finally, a single belongs to (optional) relation can still be lately used as a mandatory one (when the user declares an optional relationships, and later uses it to load non-optional parents).

This makes me think that here is our minimal list of relationships:

Relationship Domain
Belongs to [(Child, Parent?)]
Has many [(Parent, [Child])]
Has one [(Parent, Child?)]
Has and belongs to many [(Left, [Right])]

Some domains overlap, but differences remain in the declaration of relationships:

Belongs to and has one have the same domain, but the foreign key is in the table at the origin of the belongs to relationship.

Has many and Has and belongs to many have the same domain, but the latter involves an intermediate table.

Maybe we'll reintroduce belongs to (mandatory) later, when we evaluate our Swift API proposals.

@groue
Copy link
Owner

groue commented Mar 15, 2017

Take your time, @hartbit.

I'd still like to discuss the loss of belongs to (mandatory)

I'll try to keep a single belongs to (optional), and push the choice between optional/non-optional until usage (as joined(with:) / joined(withOptional:) above). Not foolproofed yet. And I haven't yet thoroughly analysed which Swift construct needs JOIN/LEFT JOIN, nor whether filters on the relation should feed ON or WHERE, even less the consequences of the nullability of the belongs to foreign key.

Could you explain in more detail why annotated(with:) does not load a [(Person, AggregatedValue)] graph?

Nothing against it! But I don't know how to efficiently specify the type of the extracted value.

The lack of typed expression is at play here: GRDB doesn't know the type of columns and derived expressions, because it promotes late conversion (as we have discussed in #183). We don't know the type of values returned by max(Column("year")), or sum(Column("price")).

The rationale against typed expressions (fostering record properties as the source of type safety instead of boilerplate lists of column declarations + robustness against oddities in the dynamic typing of values in SQLite) is subtle. My naive goal is that one does not actually miss typed expressions. But I can't prevent people from thinking they do. The peer pressure for Swift type cathedrals is very high these days.

And here we actually do miss them, since graphs are not backed by record properties.

We could have a middle ground:

// All persons with the year of their most recent book
// [(Person, Int)]
let authoredBookYear = Person.authoredBooks["year"] // or ...[Column("year")]
let graph = Person.annotated(with: max(authoredBookYear).as(Int.self))

I'm not sure.

@hartbit
Copy link
Contributor Author

hartbit commented Mar 15, 2017

Yeah I was thinking of that last example.

@groue
Copy link
Owner

groue commented Mar 15, 2017

Yes. It closes for good the opportunity for derived records, but it's a choice we could live with.

@hartbit
Copy link
Contributor Author

hartbit commented Mar 15, 2017

Why would it necessarily close the opportunity? It's still a viable option for who wants to do that, no?

@groue groue mentioned this issue Apr 7, 2017
@smhk
Copy link

smhk commented Apr 29, 2017

I want to quickly familiarize myself with GRDB, so as to contribute to this fantastic thinking :)

@groue groue mentioned this issue Jun 2, 2017
18 tasks
@groue
Copy link
Owner

groue commented Jun 6, 2017

Everything has a start! https://github.com/groue/GRDB.swift/tree/Graph has preliminary support for "has many" associations.

First declare records and their associations:

struct Child : TableMapping, RowConvertible {
    static let databaseTableName = "children"
    
    let id: Int64
    let parentId: Int64
    let name: String
    
    init(row: Row) { ... }
}

struct Parent : TableMapping, RowConvertible {
    static let databaseTableName = "parents"
    
    // Associations:
    static let children = hasMany(Child.self)
    static let filteredChildren = children.filter(Column("name") == "foo")
    
    let id: Int64
    let name: String
    
    init(row: Row) { ... }
}

Setup the database with foreign keys so that GRDB can infer involved columns:

let dbQueue: DatabaseQueue = ...
try dbQueue.inDatabase { db in
    try db.create(table: "parents") { t in
        t.column("id", .integer).primaryKey()
        t.column("name", .text)
    }
    
    try db.create(table: "children") { t in
        t.column("id", .integer).primaryKey()
        t.column("parentId", .integer).references("parents") // <-- there
        t.column("name", .text)
    }
}

And now you can use the associations:

try dbQueue.inDatabase { db in
    // All parents with their children
    let graphRequest = try Parent.including(Parent.children)
    
    // Altered children
    let orderedChildren = Parent.children.order(Column("name"))
    let graphRequest = try Parent.including(orderedChildren)
    
    let graphRequest = try Parent.including(Parent.filteredChildren)
    
    // Altered parents
    let graphRequest = try Parent
        .filter(Column("name") == "foo") // Alteration before `including`
        .including(Parent.children)
    
    let graphRequest = try Parent
        .including(Parent.children)
        .filter(Column("name") == "foo") // Alteration after `including`
    
    // The graph variable is of type [(Parent, [Child])]
    // Two SQL queries are executed:
    //     SELECT * FROM parents ...
    //     SELECT * FROM children WHERE parentID IN (...) ....
    let graph = try graphRequest.fetchAll(db)
    for (parent, children) in try graph {
        ...
    }
}

@groue
Copy link
Owner

groue commented Jun 15, 2017

Implemented so far:

// BelongsTo

struct Book {
    static let author = belongsTo(Author.self)
    let authorId: Int64?
    ...
}

struct Author {
    ...
}

let oldBooks = Book.filter(Column("year") < 2000)
let orderedBooks = Book.order(Column("title")
let twentiethAuthor = Book.author.filter(Column("birthYear") >= 1900)

let graph: [(Book, Author)] = try Book.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try oldBooks.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try orderedBooks.joined(with: Book.author).fetchAll(db)
let graph: [(Book, Author)] = try Book.joined(with: twentiethAuthor).fetchAll(db)

let graph: [(Book, Author?)] = try Book.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try oldBooks.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try orderedBooks.leftJoined(with: Book.author).fetchAll(db)
let graph: [(Book, Author?)] = try Book.leftJoined(with: twentiethAuthor).fetchAll(db)

let book: Book = ...
let author: Author? = book.fetchOne(db, Book.author)


// HasOne

struct Country {
    static let profile = hasOne(CountryProfile.self)
    ...
}

struct CountryProfile {
    let countryId: Int64?
    ...
}

let europeanCountries = Country.filter(Column("continent") == "Europe")
let orderedCountries = Country.order(Column("name")
let euroCurrencyProfile = Country.profile.filter(Column("currency") == "EUR")

let graph: [(Country, CountryProfile)] = try Country.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try europeanCountries.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try orderedCountries.joined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile)] = try Country.joined(with: euroCurrencyProfile).fetchAll(db)

let graph: [(Country, CountryProfile?)] = try Country.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try europeanCountries.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try orderedCountries.leftJoined(with: Country.profile).fetchAll(db)
let graph: [(Country, CountryProfile?)] = try Country.leftJoined(with: euroCurrencyProfile).fetchAll(db)

let country: Country = ...
let profile: CountryProfile? = country.fetchOne(db, Country.profile)


// HasMany

struct Author {
    static let books = hasMany(Book.self)
    ...
}

struct Book {
    let authorId: Int64?
    ...
}

let twentiethAuthors = Author.filter(Column("birthYear") >= 1900)
let orderedAuthors = Author.order(Column("name")
let oldBooks = Author.books.filter(Column("year") < 2000)

let graph: [(Author, Book)] = try Author.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try twentiethAuthors.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try orderedAuthors.joined(with: Author.books).fetchAll(db)
let graph: [(Author, Book)] = try Author.joined(with: oldBooks).fetchAll(db)

let graph: [(Author, Book?)] = try Author.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try twentiethAuthors.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try orderedAuthors.leftJoined(with: Author.books).fetchAll(db)
let graph: [(Author, Book?)] = try Author.leftJoined(with: oldBooks).fetchAll(db)

let graph: [(Author, [Book])] = try Author.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try twentiethAuthors.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try orderedAuthors.including(Author.books).fetchAll(db)
let graph: [(Author, [Book])] = try Author.including(oldBooks).fetchAll(db)

let graph: [(Author, Int)] = try Author.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try twentiethAuthors.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try orderedAuthors.annotated(with: Author.books.count).fetchAll(db)
let graph: [(Author, Int)] = try Author.annotated(with: oldBooks.count).fetchAll(db)

let author: Author = ...
let books: [Book] = author.fetchAll(db, Author.books)
let books: DatabaseCursor<Book> = author.fetchCursor(db, Author.books)
let book: Book? = author.fetchOne(db, Author.books)

@smhk
Copy link

smhk commented Jun 15, 2017

Very exciting.

@hartbit
Copy link
Contributor Author

hartbit commented Jun 15, 2017

Can't wait to play with that. Will do soon in my next project.

@groue
Copy link
Owner

groue commented Jun 25, 2017

HasManyAssociation can now filter by number of owned records:

struct Author {
    static let books = hasMany(Book.self)
    ...
}

struct Book {
    let authorId: Int64?
    ...
}

let authors: [Author] = try Author.filter(Author.books.isEmpty).fetchAll(db)
let authors: [Author] = try Author.filter(!Author.books.isEmpty).fetchAll(db)
let authors: [Author] = try Author.filter(Author.books.count == 1).fetchAll(db) // ==, !=, <, <=, >, >=

let currentlyActiveAuthors: [Author] = try Author
    .filter(Author.books.filter(Column("year") > 2010).count > 0)
    .order(Column("name"))
    .fetchAll(db)

When you compose two BelongsTo, HasOne, or HasMany associations, you get HasManyThroughAssociation or HasOneThroughAssociation which can join three tables. For example:

struct Person {
    ...
}

struct Citizenship {
    static let person = belongsTo(Person.self)
    let countryCode: String
    let personId: Int64
    ...
}

struct Country {
    static let citizenships = hasMany(Citizenship.self)
    static let citizens = hasMany(Citizenship.person, through: citizenships) // <- HasManyThroughAssociation
    ...
}

HasManyThroughAssociation has the same API has the regular two-tables HasManyAssociation:

let graph: [(Country, Person)] = try Country.joined(with: Country.citizens).fetchAll(db)
let graph: [(Country, Person?)] = try Country.leftJoined(with: Country.citizens).fetchAll(db)
let graph: [(Country, [Person])] = try Country.including(Country.citizens).fetchAll(db)
let graph: [(Country, Int)] = try Country.annotated(with: Country.citizens.count).fetchAll(db)
let countries: [Country] = try Country.filter(Country.citizens.isEmpty).fetchAll(db)
let persons: [Person] = try country.fetchAll(db, Country.citizens)

@hartbit
Copy link
Contributor Author

hartbit commented Dec 4, 2017

Hi @groue! I'm going to start a new project soon and plan on using GRDB.swift again. What is the state of the Graph API? I would love to use it, but I'm a bit worried depending on it for a production application if its on an experimental branch that is lagging behind master. Anything I can do to help?

@groue
Copy link
Owner

groue commented Dec 4, 2017

Hello @hartbit, glad to hear from you!

I'm going to start a new project soon and plan on using GRDB.swift again.

Couldn't be more honored.

What is the state of the Graph API? I would love to use it, but I'm a bit worried depending on it for a production application if its on an experimental branch that is lagging behind master.

It's in great shape, but it's very lacking in terms of documentation, and it hasn't been rebased since v2.0.3 (when we're at v2.4.0 already).

I dislike using tests as doc, but today this is the only available material (files that contain HasOne, HasMany, or BelongsTo in their names inside https://github.com/groue/GRDB.swift/tree/Graph/Tests/GRDBTests).

Now the bad news: I'm willing to kill this experiment, and not release it. I have been trying to write a rationale and share it in this issue, but couldn't shape my ideas well enough. Now your question kicks me out of the closet:

  1. The Graph API is an experiment, and it will never be able to become more than that until Swift supports some kind of variadic generics (and that'll be very very long).

  2. I don't feel comfortable breaking core values of GRDB that has served its users so well: progressive disclosure and graceful adaptation to application complexity. The Graph API is limited to a few requests that involve two or three tables. As soon as you need something outside of that box, you're left totally naked.

  3. No app I personnaly develop would use the Graph API. My real life graphs are always more complex that simple 1-1, 1-N, N-N relations.

To put it in another way: the Graph API had me think more about the topic of object graphs, until I now see it as a wrong solution to a badly expressed problem :-)

@Sroka
Copy link

Sroka commented Dec 4, 2017

Hi guys,

Thank you for really thinking it through. The topic is extremely non-trivial and I really hate how most ORMs handle that.

I am working as both Android and iOS developer and recently in Android we got official google library to handle SQLite. I find it quite decent and you could take a look at how they handle Relations, maybe it will give you some new ideas:
https://developer.android.com/reference/android/arch/persistence/room/Relation.html

May I ask how do you handle many to one and one to many relations right now with GRDB? Are you making multiple calls to database or make join and then segregate data?

@groue
Copy link
Owner

groue commented Dec 4, 2017

Hello @Sroka, thanks for the link to the Android relations. There is indeed an overlap with the Graph API.

May I ask how do you handle many to one and one to many relations right now with GRDB? Are you making multiple calls to database or make join and then segregate data?

You mean, how do I handle them in my applications?

  • When I need a join, I write it explicitely, in SQL, and usually load a dedicated model. This kind of code rarely has to be reused, so it is defined as private code in view controllers or view models. Below a trivial example (unfortunately, it never happens that the applications I develop are that trivial):

    class AuthorsViewController {
        var items: [AuthorItem]
        
        override func viewDidLoad() {
            super.viewDidLoad()
            items = try! dbQueue.inDatabase {
                try fetchItems()
            }
        }
        
        private func fetchItems() throws -> [AuthorItem] {
            return try AuthorItem.fetchAll(db, """
                SELECT authors.*, COUNT(books.id) AS bookCount
                FROM authors
                LEFT JOIN books ON books.authorId = authors.id
                ORDER BY authors.name
                """)
        }
        
        private struct AuthorItem: RowConvertible {
            let author: Author
            let bookCount: Int
            
            init(row: Row) {
                author = Author(row: row)
                bookCount = row["bookCount"]
            }
        }
    }

    With the Graph API, this would read:

    // Graph API
    items = try! dbQueue.inDatabase {
        try Author
            .annotated(with: Author.books.count)
            .order(Author.Columns.name)
            .fetchAll(db) // [(left: Author, right: Int)]
    }
  • Sometimes I really need to handle a whole graph of objects in memory. When this happens, I define a dedicated type that wraps plain records. Below, a simple example that does not involve joins:

    struct AuthorInfo {
        let author: Author
        let books: [Book]
        let awards: [Award]
        
        static func fetchOne(_ db: Database, authorId: Int64) throws -> AuthorInfo? {
            guard let author = try Author.fetchOne(db, key: authorId) else {
                return nil
            }
            let books = try Book.filter(authorId: author.id).fetchAll(db)
            let awards = try Award.filter(authorId: author.id).fetchAll(db)
            return AuthorInfo(author: author, books: books, awards: awards)
        }
    }

    In the sample code above, the Graph API has no use.

That's my current life with GRDB. Without any Graph API, I unfortunately have to write dedicated code. Without the Graph API, I enjoy the fact that both trivial and complex cases look almost the same, and that when a trivial case turns out to be not so trivial, there is not much refactoring or mental shift involved. That last point is at the core of my embarrassment publishing the Graph API, an API that does not scale well with application complexity.

@Sroka
Copy link

Sroka commented Dec 5, 2017

@groue Thank you very much for detailed explanation. I must say that I really like this conservative approach. While being verbose (and as someone with Java background I am quite used to verbosity :P ) it is easy to tell what will be the result of a query. If it comes to Graph API it still looks nice and could be used for simpler cases. As an alternative, if someone is using RxGRDB, results of joins can be easily aggregated with groupBy operator

@groue
Copy link
Owner

groue commented Dec 6, 2017

If it comes to Graph API it still looks nice and could be used for simpler cases.

Yes. But as I told above, there is a huge gap between simple cases and just-slightly-less-simple cases. And it's very difficult to know, in advance, if your case is simple enough or not.

Given a task, it can easily happen that you invest time feeding the Graph API with association declarations, only to eventually find out that this is a dead-end. There are two ways out of this dead-end: stop using the Graph API, or extending the Graph API with the missing features.

Extending the Graph API is not only difficult in terms of implementation (but I'm responsible for that because of the current state of the code), but also excruciatingly difficult in terms of consistency. Does the extension support filtering? ordering? grouping? counting? recursive associations? left and right joins? support for raw sql snippets? custom table aliases required by some raw sql snippets? natural combination with other Graph API feature?

To make you understand this last point, consider just a few of the many features you get with a simple hasMany relation:

struct Book: TableMapping, RowConvertible {
    static let databaseTableName = "books"
}

struct Author: TableMapping, RowConvertible {
    static let databaseTableName = "authors"
    static let books = hasMany(Book.self)
}

Including Requests:

// SELECT * FROM \"authors\";
// SELECT * FROM \"books\" WHERE (\"authorId\" IN (...))
//
// Returns [(left: Author, right: [Book])]
Author
    .including(Author.books)
    .fetchAll(db)

Annotation Requests:

// SELECT "authors".*, COUNT("books"."id")
// FROM "authors"
// LEFT JOIN "books" ON ("books"."authorId" = "authors"."id")
// GROUP BY "authors"."id"
//
// Returns [(left: Author, right: Int)]
Author
    .annotated(with: Author.books.count)
    .fetchAll(db)

Annotation predicates:

// SELECT "authors".*
// FROM "authors"
// LEFT JOIN "books" ON ("books"."authorId" = "authors"."id")
// GROUP BY "authors"."id"
// HAVING (COUNT("books"."id") > 0)
//
// Returns [Author]
Author
    .filter(Author.books.count > 0)
    .fetchAll(db)

And that's only a tiny tiny window on the hasMany association.

A combinatorial explosion happens when you combine Graph API features together. The current belongsTo, hasMany, hasOne, hasManyThrough and hasOneThrough associations are covered by hundreds of tests (ok, maybe not, but really a lot) that check a heck a lot of combinations, details, and corner cases.

This means that I don't expect any future PR to sustain this level of consistency. Consistency is a quality that is mandatory whenever you hope that users can build a regular and simple mental model of any API. There are reasons why some people like GRDB, and the fact that it appears easy to grasp is one of the key reasons. The fact that it looks easy is a consequence of its inner consistency, which allows exploring rooms where expected doors are there, open and ready to be crossed.

As an alternative, if someone is using RxGRDB, results of joins can be easily aggregated with groupBy operator

I'm not sure I understand. Would you care to elaborate?

@groue
Copy link
Owner

groue commented Dec 6, 2017

By the way, this is the second failed attempt at dealing with joins (previous was #103). I'm amazed by the level of maturity required by this topic 😅, and also by the amount of work (code + documentation) I need until a solution eventually appears as a dead-end.

I don't know yet how next attempt will look like!

@hartbit
Copy link
Contributor Author

hartbit commented Dec 6, 2017

I can't refrain from feeling a little bit disappointed that we have reached a dead-end for now. But I'm reassured that you're taking the time to make sure we end up with something that is well designed and thought out. 👍

Let me make a few remarks:

Given a task, it can easily happen that you invest time feeding the Graph API with association declarations, only to eventually find out that this is a dead-end. There are two ways out of this dead-end: stop using the Graph API, or extending the Graph API with the missing features.

This already happens to me today, but with the current Query interface. I start using the Query interface because I need a simple select, for example a list of authors. Then the design team comes to me asking that we add the number of books from the author on the cell, and I have reached a dead end that forces me to stop using the Query interface and revert to pure SQL.

Perhaps the solution is to accept that the join API should not handle all the cases/complexity. But if it could handle some of the simple cases, it would drastically reduce the number of times I have to resort to pure SQL. And while I love using SQL, and I admit that its the only solution for truly complicated queries, the loss of compile-time checking for simple cases feels a pity.

To give you an idea, I've taken the time to categories all custom SQL queries I do in my latest big app and tried to guess which are simple enough that they could be handled by a simple API. I've renamed sensitive table and field names for privacy reasons, but apart from that, those are real queries I use.

Once we everything up, a simple API like the one you have built would reduce my number of custom queries from 26 to 4, which would be huge for me.

Please let me know what you think.

Annotation predicates (15 queries)

Most of my custom queries are those. I need to join only to go test a predicate of a child/grand-child table. For example:

// SELECT reports.*
// FROM reports
// INNER JOIN teams ON teams.reportId = reports.id
// INNER JOIN teamMembers ON teamMembers.teamId = teams.id
// WHERE reports.isDeleted = 0
// AND reports.projectId = :projectId
// AND teamMembers.employeeId = :employeeId
// GROUP BY reports.id
//
// Returns [Report]
Report
    .filter(!Report.isDeleted && Report.projectId == projectId)
    .filter(Report.teams.teamMembers.employeedId == employeeId)
    .fetchAll(db)

Question: Is the example above simple to support? We need to traverse two joins to get to the field which the filter predicate is based on.

Annotation requests (7 queries)

I have a bit less of those. For example:

// SELECT bookId, max(chapters.lastUpdated) AS lastUpdated
// FROM books
// LEFT JOIN chapters ON books.id = chapters.bookId
// GROUP BY bookId
//
// Returns [(Int, Date)]
Book
    .select(Book.id)
    .annotated(with: max(Book.chapters.lastUpdated))
    .fetchAll(db)

Complex queries (4 queries)

Those would be a mess to do without raw SQL. I'm okay with that. For example:

SELECT
    libraries.*,
    books.*,
    group_concat(DISTINCT authors.name) AS authorNames,
    count(chapters.id OR
          preludes.id OR
          appendixes.id) AS totalChapterCount,
    IFNULL(books.id, RANDOM()) as groupId
FROM libraries
LEFT JOIN books ON libraries.id = books.libraryId
LEFT JOIN bookAuthors ON books.id = bookAuthors.bookId
LEFT JOIN authors ON bookAuthors.authorId = authors.id
LEFT JOIN chapters ON books.id = chapters.chapterId
LEFT JOIN preludes ON books.id = preludes.bookId
LEFT JOIN appendixes ON books.id = appendixes.bookId
WHERE libraries.id = :libraryId
GROUP BY groupId

@smhk
Copy link

smhk commented Dec 6, 2017

(It's funny that this all started with the humble phrase "Joins in the Query Interface" !)

"Extending the Graph API is not only difficult in terms of implementation (but I'm responsible for that because of the current state of the code), but also excruciatingly difficult in terms of consistency. Does the extension support filtering? ordering? grouping? counting? recursive associations? left and right joins? support for raw sql snippets? custom table aliases required by some raw sql snippets? natural combination with other Graph API feature?"

Just one man's opinion:

I completely agree with Groue.

I am pleased that the 'graph' experiment has been set aside.

It's too conceptually messy - I would go further and say: it is a poorly founded idea - to try to map the almost natural-language like flexibility of SQL joins, to the structure of GRDB.

I would phrase the above more simply, "Extending the Graph API is difficult in implementation, but also difficult conceptually." It's just not an elegant or well-founded "thing". The overall graph experiment was (I think) an investigation in to that idea: it was not "version one of a solution".

So where I mention above, "it is a poorly founded idea", the fuller expression there is; the whole topic (of "incorporating broader SQL") is so subtle and complex, it took this much investigation to learn what we now know: its' a poorly founded idea that will never be consistent and conceptually sound (setting aside implementation difficulties, and that's a big one!)

(Thus for example, hartbit, looking at your three categories of example sql there (15/7/4), I personally don't really see anything that immediately jumps out which suggests - let us put it this way - weakening the GRDB concept by adding "a little" special handling.) (For example, the slightest change in the business needs of those queries, would wildly change the "categories" you know? I don't really see a real, safe, meaningful "simpler set" - one view!)

(Interestingly, I personally came to GRDB for the specific reason that you can let us say inherently and naturally use "full sql when you need to" in GRDB; the other older libraries fall-down in this. To me - again it can only be one man's opinion - it would be a false/inelegant goal for a Swift SQL library to think that all sql can be re-presented.)

@hartbit
Copy link
Contributor Author

hartbit commented Dec 6, 2017

A few counter-arguments:

It's too conceptually messy - I would go further and say: it is a poorly founded idea - to try to map the almost natural-language like flexibility of SQL joins, to the structure of GRDB.

I think there is a big difference between the goals of mapping the whole flexibility of the SQL syntax and mapping only the simple cases. That's the whole reason the Query interface exists today: it's a nicer interface for the simple cases. There are two extremes: no interface whatsoever, always full raw SQL, which is rather inconvenient, and at the opposite extreme, trying to map the whole SQL language, which would make for a complicated mess. All this to say that I don't disagree with the premise, that mapping the flexibility of joins is too messy, but that we don't need to map the full flexibility of joins to support some fairly common use cases.

(Thus for example, hartbit, looking at your three categories of example sql there (15/7/4), I personally don't really see anything that immediately jumps out which suggests - let us put it this way - weakening the GRDB concept by adding "a little" special handling.) (For example, the slightest change in the business needs of those queries, would wildly change the "categories" you know? I don't really see a real, safe, meaningful "simpler set" - one view!)

I have the same problem today when I start with a simple Query interface request and have to go one level up in complexity, forcing me to rewrite the query in raw SQL. I'm just advocating for the barrier to be slightly higher and support a few common join operations.

(Interestingly, I personally came to GRDB for the specific reason that you can let us say inherently and naturally use "full sql when you need to" in GRDB; the other older libraries fall-down in this. To me - again it can only be one man's opinion - it would be a false/inelegant goal for a Swift SQL library to think that all sql can be re-presented.)

I also came to GRDB for the power to express "full sql when you need to". I think we just disagree on when we need to 😄

@groue
Copy link
Owner

groue commented Dec 6, 2017

Thanks @hartbit and @smhk for your detailed answers. See you soon with an update.

@groue
Copy link
Owner

groue commented Feb 22, 2018

Hello folks, here is some update!

Since building joined queries is so delicate, I've changed my focus a little bit, on the consumption of joined queries.

https://github.com/groue/GRDB.swift/blob/development/README.md#joined-queries-support

I don't expect those APIs to be very controversial, since they extend existing GRDB techniques, and attempts at standardizing a safe way to deal with complex SQL queries. Ha, and we can now write nested Decodable records 😄

@groue groue removed the help wanted Extra attention is needed label Mar 6, 2018
@groue
Copy link
Owner

groue commented Mar 12, 2018

Obsoleted by #319

@groue groue closed this as completed Mar 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants