Remove the need for primary key column when embedding parents in &select #411

ruslantalpa · 2015-12-08T07:45:04Z

(opened this issue so that i don't forget the bug, and i have a ID for the issue)

begriffs · 2016-04-03T19:33:30Z

@ruslantalpa is this issue still relevant?

ruslantalpa · 2016-04-04T09:34:12Z

Yes this is still an issue

http://localhost:3000/projects?select=*,client{name}

returns this

{
  "hint": null,
  "details": null,
  "code": "42703",
  "message": "column clients.id does not exist"
}

steve-chavez · 2017-09-13T23:25:08Z

This happens only in Parent embeds because when detecting this relation PostgREST executes a different sql query(doing explicit joins) than the queries used for the Child and Many to Many embeds(they use implicit joins), a way I see to fix this is to change the Parent query to a query similar to the other ones, with implicit joins, so it's basically like this:

-- Current query generated by GET /projects?select=*,clients(id,name)
WITH pg_source AS (
  SELECT "test"."projects".*,
          row_to_json("clients_clients".*) AS "clients"
  FROM "test"."projects"
  LEFT OUTER JOIN (
    SELECT "test"."clients"."id",
           "test"."clients"."name"
    FROM "test"."clients"
  ) AS "clients_clients" ON "clients_clients"."id" = "projects"."client_id")
SELECT NULL AS total_result_set,
       pg_catalog.count(_postgrest_t) AS page_total, 
       array[]::text[] AS header,
       coalesce(array_to_json(array_agg(row_to_json(_postgrest_t))), '[]')::character varying AS BODY
FROM ( SELECT * FROM pg_source) _postgrest_t;


-- Proposed new query with implicit joins
WITH pg_source AS (
  SELECT "test"."projects".*,
          (SELECT row_to_json("clients")
           FROM (
             SELECT "test"."clients"."id",
                    "test"."clients"."name"
             FROM "test"."clients"
             WHERE "test"."clients"."id" = "test"."projects"."client_id" 
           ) "clients"
          ) AS "clients"
  FROM "test"."projects") 
SELECT NULL AS total_result_set,
       pg_catalog.count(_postgrest_t) AS page_total, 
       array[]::text[] AS header,
       coalesce(array_to_json(array_agg(row_to_json(_postgrest_t))), '[]')::character varying AS body
FROM ( SELECT * FROM pg_source) _postgrest_t;

They both give the same result:

[{"id":1,"name":"Windows 7","client_id":1,"clients":{"id":1,"name":"Microsoft"}},{"id":2,"name":"Windows 10","client_id":1,"clients":{"id":1,"name":"Microsoft"}},{"id":3,"name":"IOS","client_id":2,"clients":{"id":2,"name":"Apple"}},{"id":4,"name":"OSX","client_id":2,"clients":{"id":2,"name":"Appl
e"}},{"id":5,"name":"Orphan","client_id":null,"clients":null}]

But the new query can be used without the id in the querystring like in
GET projects/?select=name,clients(name), look at the sql query to see why.

The drawback of this new query is that is more expensive, when doing an EXPLAIN the costs are:

Current query with explicit joins: cost=146.38..146.40
New query with implicit joins:cost=15526.90..15526.92

However the Child and Many to Many embeds also generate high cost queries:

-- Many to many embed
GET /tasks?select=id,users{id}
cost=16838.48..16838.49

-- Child embed
GET /clients?select=*,projects{name}
cost=42793.13..42793.14

So as a first step towards solving this issue I'm gonna change the Parent embed to the new proposed query and make the tests pass, this I think would allow for a simplification of the requestToQuery function and then I'll see how all the queries can be accommodated to use explicit joins.

ruslantalpa · 2017-09-13T23:37:17Z

i am completley against that.
having to request and aditional filed is a minor inconvinience.
a drop in performance by 100 times (2 orders of magnitude) is a big problem.

the direction should be how do we make child embeds faster, not the other way arround.

it would be mich better to make it so that if the user requests a parent without an id, we automatically add the id to the select so that we do not generate a bad query, and when the user wonders why did i get the id, the docs would say that it's necessary for the query.
a hike in cost from 150 to 15000 is unacceptable

just because some cases of the embeds have worse performance, does nto amke it acceptable to drag down the other cases when it's fast

steve-chavez · 2017-09-14T00:48:48Z

I mentioned that the complete solution will have all the embedding queries doing explicit joins(lowest costs).

ruslantalpa · 2017-09-14T01:00:59Z

but dont we have that right now with parrent embedding, and in this type of join, you need the id,
and seccond, i don't think you can do explicit joins for everything because that would be a "flat" thing, so there is no tree like structure/recursive nature to the query which you need.

I mean you are welcome to try :) and even if you are doing this for one level (up or down) think how it will work going three levels down.

begriffs added the bug label Apr 4, 2016

manuelmarichal mentioned this issue Oct 14, 2016

Can't rename primary key column of related table when embedding data #719

Closed

steve-chavez self-assigned this Sep 13, 2017

steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Sep 16, 2017

Fix PostgREST#411, remove the need for pk in &select for parent embed

a14e2f4

steve-chavez mentioned this issue Sep 16, 2017

Remove the need for pk in &select for parent embed #978

Merged

steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Sep 17, 2017

Fix PostgREST#411, remove the need for pk in &select for parent embed

5849a00

steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Nov 27, 2017

Fix PostgREST#411, remove the need for pk in &select for parent embed

d24f598

steve-chavez closed this as completed in fef7d94 Nov 30, 2017

monacoremo pushed a commit to monacoremo/postgrest that referenced this issue Jul 17, 2021

Fix PostgREST#411, remove the need for pk in &select for parent embed

22fd36f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove the need for primary key column when embedding parents in &select #411

Remove the need for primary key column when embedding parents in &select #411

ruslantalpa commented Dec 8, 2015

begriffs commented Apr 3, 2016

ruslantalpa commented Apr 4, 2016

steve-chavez commented Sep 13, 2017

ruslantalpa commented Sep 13, 2017 •

edited

Loading

steve-chavez commented Sep 14, 2017

ruslantalpa commented Sep 14, 2017

Remove the need for primary key column when embedding parents in &select #411

Remove the need for primary key column when embedding parents in &select #411

Comments

ruslantalpa commented Dec 8, 2015

begriffs commented Apr 3, 2016

ruslantalpa commented Apr 4, 2016

steve-chavez commented Sep 13, 2017

ruslantalpa commented Sep 13, 2017 • edited Loading

steve-chavez commented Sep 14, 2017

ruslantalpa commented Sep 14, 2017

ruslantalpa commented Sep 13, 2017 •

edited

Loading