When building database-backed applications you'll often find yourself wanting to return a row from the database along with its related rows.
A few examples:
- Retrieving a list of congressional legislators and their terms, following a foreign key relationship
- Return blog entries and their tags in one go, via a many-to-many table
You can do this in SQLite using the json_group_array() aggregation function. A couple of examples.
Simplified schema for this database:
CREATE TABLE [legislators] (
[id] TEXT PRIMARY KEY,
[name] TEXT,
[bio_birthday] TEXT
);
CREATE TABLE [legislator_terms] (
[legislator_id] TEXT REFERENCES [legislators]([id]),
[type] TEXT,
[state] TEXT,
[start] TEXT,
[end] TEXT,
[party] TEXT
);
Here's a query that returns each legislator along with a JSON array of their terms:
select
legislators.id,
legislators.name,
json_group_array(json_object(
'type', legislator_terms.type,
'state', legislator_terms.state,
'start', legislator_terms.start,
'end', legislator_terms.end,
'party', legislator_terms.party
)) as terms,
count(*) as num_terms
from
legislators join legislator_terms on legislator_terms.legislator_id = legislators.id
group by legislators.id
order by
id
limit
10
And the result:
Note that this query does group by legislators.id
which is allowed in SQLite but may not work in other databases, which might require group by legislators.id, legislators.name
instead.
Simplified schema:
CREATE TABLE [blog_entry] (
[id] INTEGER PRIMARY KEY,
[title] TEXT
);
CREATE TABLE [blog_tag] (
[id] INTEGER PRIMARY KEY,
[tag] TEXT
);
CREATE TABLE [blog_entry_tags] (
[id] INTEGER PRIMARY KEY,
[entry_id] INTEGER,
[tag_id] INTEGER,
FOREIGN KEY([entry_id]) REFERENCES [blog_entry]([id]),
FOREIGN KEY([tag_id]) REFERENCES [blog_tag]([id])
);
Query to retrieve entries with their tags:
select
blog_entry.id,
blog_entry.title,
json_group_array(json_object('tag', blog_tag.tag)) as tags
from
blog_entry
join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
join blog_tag on blog_tag.id = blog_entry_tags.tag_id
group by
blog_entry.id
order by
blog_entry.id desc
id | title | tags |
---|---|---|
8191 | I don't know how to solve prompt injection | [{"tag":"ai"},{"tag":"security"},{"tag":"openai"}] |
8190 | Weeknotes: Datasette Lite, s3-credentials, shot-scraper, datasette-edit-templates and more | [{"tag":"shotscraper"},{"tag":"datasette"},{"tag":"plugins"},{"tag":"datasettelite"},{"tag":"projects"},{"tag":"s3credentials"},{"tag":"weeknotes"}] |
8189 | Prompt injection attacks against GPT-3 | [{"tag":"ai"},{"tag":"gpt3"},{"tag":"security"},{"tag":"openai"}] |
There's a subtle bug in the above: if an entry has no tags at all it will be excluded from the query results entirely.
You can fix that using left joins like this:
select
blog_entry.id,
blog_entry.title,
json_group_array(json_object('tag', blog_tag.tag)) as tags
from
blog_entry
left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
left join blog_tag on blog_tag.id = blog_entry_tags.tag_id
where blog_entry.id < 4
group by
blog_entry.id
order by
blog_entry.id desc
This almost works, but it outputs the following returning {"tag": null}
for entries with no tags:
id | title | tags |
---|---|---|
3 | Todo list | [{"tag":null}] |
2 | Blogging aint easy | [{"tag":null}] |
1 | WaSP Phase II | [{"tag":null}] |
David Fetter showed me the solution:
select
blog_entry.id,
blog_entry.title,
json_group_array(
json_object('tag', blog_tag.tag)
) filter (
where
blog_tag.tag is not null
) as tags
from
blog_entry
left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
left join blog_tag on blog_tag.id = blog_entry_tags.tag_id
group by
blog_entry.id
order by
blog_entry.id
That extra filter on the aggregation does the trick!
Other databases are capable of the same thing, but using different functions. PostgreSQL has json_agg() for example, which is also available in Django as JSONBAgg.
Here's an equivalent query in PostgreSQL syntax:
select
blog_entry.id,
title,
slug,
created,
coalesce(json_agg(json_build_object(blog_tag.id, blog_tag.tag)) filter (
where
blog_tag.tag is not null
), json_build_array()) as tags
from
blog_entry
left join blog_entry_tags on blog_entry.id = blog_entry_tags.entry_id
left join blog_tag on blog_entry_tags.tag_id = blog_tag.id
group by
blog_entry.id
order by
blog_entry.id
See that running here in django-sql-dashboard
.