Hierarchy Implementation Proposal #793

temmey · 2023-08-05T15:07:54Z

Basics

I added a line to /doc/CHANGELOG.md
The PR is rebased with current master.
Details of what you changed are in commit messages.
References to issues, e.g. close #X, are in the commit messages.
The buildserver is happy.

Checklist

Review

I've tested the code
I've read through the whole code
Examples are well chosen and understandable

markus2330

In general, I like the idea, but it looks like it doesn't fulfill the requirements?

markus2330 · 2023-08-05T16:06:07Z

doc/decisions/database_plant_hierarchy.md

+Cons:
+
+- Attribute overrides can only be done on the variety or cultivar level.
+- More complex insert and update logic.


But it would be hidden behind triggers? In general this sounds like a good idea, as we then don't need to keep Rust, maybe Python (E2E) and JavaScript code (scraper) in sync.

temmey · 2023-08-06

Yes, and the view would behave just like our current plants table; we only extend it with some more properties.
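A minimal sketch of "hidden behind triggers", assuming a reduced two-level hierarchy. It uses SQLite through Python's `sqlite3` only so that it runs without a server; the real migrations target PostgreSQL, where the trigger would call a PL/pgSQL function instead. The table, column, and trigger names and the sample rows are hypothetical and are not taken from the example migrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    height TEXT
);
CREATE TABLE cultivars (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species (id),
    name TEXT NOT NULL,
    height TEXT  -- NULL means: inherit from the species
);

-- Readers see one flat relation, like a plain plants table.
CREATE VIEW plants_view AS
SELECT
    c.id AS id,
    s.name AS species_name,
    c.name AS cultivar_name,
    COALESCE(c.height, s.height) AS height
FROM cultivars AS c
JOIN species AS s ON s.id = c.species_id;

-- Writers insert into the view; the trigger routes the row.
CREATE TRIGGER plants_view_insert INSTEAD OF INSERT ON plants_view
BEGIN
    -- Create the species the first time it is seen.
    INSERT INTO species (name, height)
    SELECT NEW.species_name, NEW.height
    WHERE NOT EXISTS (SELECT 1 FROM species WHERE name = NEW.species_name);

    -- Store the height on the cultivar only if it differs from the species.
    INSERT INTO cultivars (species_id, name, height)
    SELECT
        s.id,
        NEW.cultivar_name,
        CASE WHEN s.height IS NEW.height THEN NULL ELSE NEW.height END
    FROM species AS s
    WHERE s.name = NEW.species_name;
END;
""")

# Hypothetical sample rows; callers only ever talk to the view.
conn.executemany(
    "INSERT INTO plants_view (species_name, cultivar_name, height) VALUES (?, ?, ?)",
    [
        ("Solanum lycopersicum", "Lemon drop", "high"),
        ("Solanum lycopersicum", "Tall one", "very high"),
    ],
)

stored = conn.execute("SELECT name, height FROM cultivars ORDER BY id").fetchall()
resolved = conn.execute(
    "SELECT cultivar_name, height FROM plants_view ORDER BY id"
).fetchall()
print(stored)
print(resolved)
```

The backend, the E2E tests, and the scraper would all issue the same plain insert, so the routing logic lives in one place.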

markus2330 · 2023-08-06T17:30:16Z

I like the idea more and more. Remaining questions:

What about backend compilation time? From rust we basically only see the view, don't we?

How is the performance?

How will we do the import? Will we have several CSVs for each hierarchy level?

temmey · 2023-08-06T23:02:16Z

> What about backend compilation time? From rust we basically only see the view, don't we?

In Rust, we would treat the view as a table. For that, we need to adjust schema.rs via the schema.patch file.
See: https://deterministic.space/diesel-view-table-trick.html for an example.
The other tables would also be generated in the schema.rs.
We could remove them for now with the patch file, but in the long term, we may want to build functionality that requires these tables.

> How is the performance?

😬 Well, every insert/update in the plants table would need to check up to 5 other tables with these functions.
I am not experienced enough to make a prediction about how the performance would look.
But since plant inserts/updates are, as far as I know, rarely used except during import, I don't think it will be a huge factor.

> How will we do the import? Will we have several CSVs for each hierarchy level?

Since we want to replace the insert functions, our current implementation should work. It may be slower, but see my answer above.

markus2330 · 2023-08-07T06:24:51Z

> How is the performance?

I mean for what we do at runtime: searching plants etc.

markus2330 · 2023-08-07T06:37:03Z

To bring this to a conclusion: I think there are only two options: either the way you imagine it, or with everything in plants (keys point to other rows in the same table). Can you please write up the pros and cons of these two options in the decision?
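For comparison, a rough sketch of the second option (everything in one table, with keys pointing to other rows of the same table). It again uses SQLite through Python's `sqlite3` as a stand-in for PostgreSQL. The column names `parent_id`, `taxon_rank`, and `height`, the helper `resolve_height`, and the property values are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plants (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES plants (id),  -- NULL for the top rank
    taxon_rank TEXT NOT NULL,
    name TEXT NOT NULL,
    height TEXT  -- nullable so that a row can inherit from its parent
);
INSERT INTO plants (id, parent_id, taxon_rank, name, height) VALUES
    (1, NULL, 'family', 'Solanaceae', NULL),
    (2, 1, 'genus', 'Solanum', NULL),
    (3, 2, 'species', 'Solanum lycopersicum', 'high'),
    (4, 3, 'cultivar', 'Lemon drop', NULL);
""")

# Walk from the row up to the root and take the nearest non-NULL value.
RESOLVE_HEIGHT = """
WITH RECURSIVE chain (id, parent_id, height, depth) AS (
    SELECT id, parent_id, height, 0 FROM plants WHERE id = ?
    UNION ALL
    SELECT p.id, p.parent_id, p.height, chain.depth + 1
    FROM plants AS p
    JOIN chain ON p.id = chain.parent_id
)
SELECT height FROM chain WHERE height IS NOT NULL ORDER BY depth LIMIT 1
"""


def resolve_height(plant_id):
    """Return the effective height of a row, or None if no ancestor sets one."""
    row = conn.execute(RESOLVE_HEIGHT, (plant_id,)).fetchone()
    return row[0] if row else None


print(resolve_height(4))  # inherited from the species row
print(resolve_height(2))  # nothing set at genus level or above
```

This keeps a single table, but every property lookup becomes a recursive walk, and nearly every column has to be nullable.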

temmey · 2023-08-14T19:35:36Z

For now, I implemented it following this hierarchy.

This increases complexity as some plants are cultivars, others are varieties.
Cultivars can inherit from species and varieties.
While solvable with my proposal, is this hierarchy necessary?
We currently identify cultivars only by the single quotes ('') around their name. Is this correct? For example, in `Solanum lycopersicum 'Lemon drop'`, 'Lemon drop' is the cultivar.
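The quote-based rule could look roughly like the following sketch. The pattern and the helper name `split_cultivar` are hypothetical and only show the idea; they are not the scraper's actual implementation.

```python
import re
from typing import Optional, Tuple

# Assumed convention: a botanical name followed by a cultivar epithet
# in single quotes at the end of the string.
CULTIVAR_RE = re.compile(r"^(?P<base>.+?)\s+'(?P<cultivar>[^']+)'\s*$")


def split_cultivar(unique_name: str) -> Tuple[str, Optional[str]]:
    """Return (base name, cultivar), or (name, None) if no quoted epithet exists."""
    name = unique_name.strip()
    match = CULTIVAR_RE.match(name)
    if match is None:
        return name, None
    return match.group("base"), match.group("cultivar")


print(split_cultivar("Solanum lycopersicum 'Lemon drop'"))
print(split_cultivar("Solanum lycopersicum"))
```

A rule like this cannot tell whether the part before the quotes is a species or a variety, which is exactly the ambiguity raised above.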

badnames

Looks good thus far.
As usual here are a few things I noticed while reading over your changes.

badnames · 2023-08-19T16:01:22Z

doc/decisions/database_plant_hierarchy.md

@@ -34,6 +34,8 @@ See the [PSQL documentation](https://www.postgresql.org/docs/current/ddl-inherit

 > Table inheritance is typically established when the child table is created, using the INHERITS clause of the CREATE TABLE statement.

+> Rust Diesel isn't intended for that. To only select data from a specific table, and not include all child tables, we would need to use the `FROM ONLY` keyword, which is not implemented in Rust Diesel.


Suggested change

-> Rust Diesel isn't intended for that. To only select data from a specific table, and not include all child tables, we would need to use the `FROM ONLY` keyword, which is not implemented in Rust Diesel.
+> Rust Diesel isn't intended for that.
+> To only select data from a specific table, and not include all child tables, we would need to use the `FROM ONLY` keyword, which is not implemented in Rust Diesel.

Sentences need to be written on separate lines according to our documentation guidelines.

badnames · 2023-08-19T16:02:12Z

doc/decisions/database_plant_hierarchy.md

+
+[Example](example_migrations/one-table-per-taxonomy-view-functions)
+
+It's similar to `One table for taxonomy ranks and one for concrete plants` We are extending it with a view and custom functions to reduce insert and update complexity in the backend and scraper.


Suggested change

-It's similar to `One table for taxonomy ranks and one for concrete plants` We are extending it with a view and custom functions to reduce insert and update complexity in the backend and scraper.
+It is similar to `One table for taxonomy ranks and one for concrete plants`.
+We are extending it with a view and custom functions to reduce insert and update complexity in the backend and scraper.

badnames · 2023-08-19T16:03:14Z

doc/decisions/database_plant_hierarchy.md

@@ -85,6 +87,46 @@ Cons:
 - Almost everything in the plants table needs to be nullable.
 - More complex insert and update logic.

+### One table per taxonomy rank and one for concrete plants. + View and custom insert/update/delete functionality


This title is a bit confusing in my opinion.

badnames · 2023-08-19T16:04:23Z

doc/decisions/database_plant_hierarchy.md

+Pros:
+
+- Inserting new plants is easy. We only need to implement minor backend changes.
+- Properties overrides can be done on every level.


Suggested change

-- Properties overrides can be done on every level.
+- Property overrides can be done on every level.

badnames · 2023-08-19T16:12:46Z

doc/decisions/database_plant_hierarchy.md

+  Only if we can't find a match, the value should be written.
+  - We can offset this issue by implementing insert/update functions.
+    Since they are going to be complicated, long-term maintainability may be an issue.
+- Almost everything in the plants and parent tables needs to be nullable. (Is this a downside?)


Yes, I think so. If everything is nullable (even where this goes against our data model), we need to perform more manual integrity checks.

badnames · 2023-08-19T16:15:13Z

...le_migrations/one-table-per-taxonomy-view-functions/2023-03-09-194135_plant_relations/up.sql

+-- COALESCE function accepts an unlimited number of arguments.
+-- It returns the first argument that is not null.
+
+--todo


Please use English everywhere, even if you just leave notes for yourself.
This would help non-German-speaking team members to gauge your progress more easily when reviewing draft PRs.
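As a small illustration of the `COALESCE` behaviour described in the migration comment, the sketch below resolves properties across three ranks. It runs on SQLite through Python's `sqlite3` as a stand-in for PostgreSQL. The schema is a reduced assumption (the real proposal also has families and genera), and the variety, the second cultivar, and all property values are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    height TEXT,
    shade TEXT
);
CREATE TABLE varieties (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species (id),
    name TEXT NOT NULL,
    height TEXT,
    shade TEXT
);
CREATE TABLE cultivars (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species (id),
    variety_id INTEGER REFERENCES varieties (id),  -- optional intermediate rank
    name TEXT NOT NULL,
    height TEXT,
    shade TEXT
);

-- The most specific non-NULL value wins: cultivar, then variety, then species.
CREATE VIEW plants_view AS
SELECT
    c.id AS id,
    c.name AS name,
    COALESCE(c.height, v.height, s.height) AS height,
    COALESCE(c.shade, v.shade, s.shade) AS shade
FROM cultivars AS c
JOIN species AS s ON s.id = c.species_id
LEFT JOIN varieties AS v ON v.id = c.variety_id;

INSERT INTO species VALUES (1, 'Solanum lycopersicum', 'high', 'no shade');
INSERT INTO varieties VALUES (1, 1, 'example variety', NULL, 'light shade');
INSERT INTO cultivars VALUES (1, 1, 1, 'Lemon drop', NULL, NULL);
INSERT INTO cultivars VALUES (2, 1, NULL, 'Example cultivar', 'medium', NULL);
""")

resolved = conn.execute(
    "SELECT name, height, shade FROM plants_view ORDER BY id"
).fetchall()
for row in resolved:
    print(row)
```

The first cultivar takes its height from the species and its shade from the variety; the second overrides the height itself and falls back to the species for shade.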

chr-schr

Overall I like the proposal if overriding plant properties on every rank is something that we need.
Select performance would be great if we use a materialized view. Currently we only insert our plants offline and never update them, so the materialized view would never need to be refreshed.
Or are we expecting that to change in the near future? (Inserting/updating plants live)
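To make the refresh question concrete, here is a hedged sketch of why a materialized view only stays correct while nothing is written after the last refresh. SQLite has no materialized views, so a snapshot table rebuilt by a hypothetical `refresh_snapshot` helper plays the role that `REFRESH MATERIALIZED VIEW` has in PostgreSQL; the schema and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT NOT NULL, height TEXT);
CREATE TABLE cultivars (
    id INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species (id),
    name TEXT NOT NULL,
    height TEXT
);
CREATE VIEW plants_view AS
SELECT c.id AS id, c.name AS name, COALESCE(c.height, s.height) AS height
FROM cultivars AS c
JOIN species AS s ON s.id = c.species_id;

INSERT INTO species VALUES (1, 'Solanum lycopersicum', 'high');
INSERT INTO cultivars VALUES (1, 1, 'Lemon drop', NULL);
""")


def refresh_snapshot():
    """Rebuild the precomputed copy of the view (stand-in for a refresh)."""
    conn.executescript("""
    DROP TABLE IF EXISTS plants_snapshot;
    CREATE TABLE plants_snapshot AS SELECT * FROM plants_view;
    """)


def snapshot_count():
    """Number of rows that readers of the precomputed copy would see."""
    return conn.execute("SELECT COUNT(*) FROM plants_snapshot").fetchall()[0][0]


refresh_snapshot()
before = snapshot_count()

# A live insert after the refresh is invisible to snapshot readers ...
conn.execute("INSERT INTO cultivars VALUES (2, 1, 'Example cultivar', 'medium')")
stale = snapshot_count()

# ... until the snapshot is rebuilt.
refresh_snapshot()
after = snapshot_count()
print(before, stale, after)
```

With offline-only imports, a single refresh at the end of the import would be enough; live inserts or updates would need a refresh strategy.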

As far as I understand, maintainability will suffer a bit. Adding a new plant property would then additionally involve:

- Adding a new column to all plant rank tables (families, genera, species, varieties, cultivars)
- Updating the plants_view to include the new column
- (Maybe) updating the insert / update trigger functions to support the new column

first version of the proposal

7fbab8b

temmey changed the title ~~first version of the proposal~~ Hierarchy Implementation Proposal Aug 5, 2023

temmey mentioned this pull request Aug 5, 2023

Plant Hierarchy Implementation Proposal #764

Open

6 tasks

markus2330 suggested changes Aug 5, 2023

clearified ability to override properties

93c660a

This was referenced Aug 11, 2023

cleanup of old doc migration files #812

Closed

757 sql reformatting finetune sqlfluff #765

Merged

temmey added 4 commits August 13, 2023 17:30

Merge branch 'master' into plant-hierarchy-implementation-proposal-764

7532f7d

Merge branch 'master' into plant-hierarchy-implementation-proposal-764

d28a3e0

Clarified proposal vs current solution

f61fd59

added testdata as an example

f64f7ad

badnames reviewed Aug 19, 2023

Merge branch 'master' into plant-hierarchy-implementation-proposal-764

991a00f

markus2330 requested a review from chr-schr February 19, 2024 09:45

markus2330 assigned chr-schr Mar 30, 2024

chr-schr reviewed Apr 14, 2024

chr-schr linked an issue Apr 16, 2024 that may be closed by this pull request

Plant Hierarchy Implementation Proposal #764

Open

6 tasks
