This post on Reddit asked how to run a query that takes a list of items (in this case ingredients) as the input and returns all rows with at least one of those items in a JSON list, ordered by the most matches.
I jumped on this as a great opportunity to demonstrate Datasette Lite as a tool for answering SQL questions.
I started with the following SQL to create the demo table:
create table recipes (id integer primary key, name text, ingredients text);
insert into recipes values (1, 'Cake', '["flour", "sugar", "eggs"]');
insert into recipes values (2, 'Pancakes', '["flour", "eggs", "butter"]');
insert into recipes values (3, 'Waffles', '["flour", "milk", "eggs"]');
insert into recipes values (4, 'Pizza', '["flour", "sugar", "eggs", "cheese"]');
I actually got GitHub Copilot to write most of that for me:
I saved that to a gist and opened it in Datasette Lite with this URL:
Here's the SQL recipe I came up with to solve the question:
select id, name, ingredients, (
select json_group_array(value) from json_each(ingredients)
where value in (select value from json_each(:p0))
) as matching_ingredients
from recipes
where json_array_length(matching_ingredients) > 0
order by json_array_length(matching_ingredients) desc
Example of that query with an input for p0
of ["sugar", "cheese"]
, which returns:
id | name | ingredients | matching_ingredients |
---|---|---|---|
4 | Pizza | ["flour", "sugar", "eggs", "cheese"] | ["sugar","cheese"] |
1 | Cake | ["flour", "sugar", "eggs"] | ["sugar"] |
The key trick here is the bit that creates that matching_ingredients
column, which uses a sub-select like this:
select json_group_array(value) from json_each(ingredients)
where value in (select value from json_each(:p0))
json_group_array(value)
is an aggregation function that turns the results into a JSON array.
where value in (select value from json_each('["sugar", "cheese"]')
is the bit that calculates the intersection of the two JSON arrays.
Then at the end I use json_array_length()
to remove rows with no overlap and then sort with the most matching ingredients first.