Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design for data modeling #222

Closed
kgodey opened this issue Jun 9, 2021 · 3 comments
Closed

Design for data modeling #222

kgodey opened this issue Jun 9, 2021 · 3 comments
Milestone

Comments

@kgodey
Copy link
Contributor

kgodey commented Jun 9, 2021

Problem

Users may want to alter their data model after they've created tables and have some data. We should allow them to do so.

Implementation details

Imagine the following roster table:

ID Student Name Student Email Class Teacher Name Teacher Email Grade
1 Beverly Crusher [email protected] Exobiology Leonard McCoy [email protected] A+
2 William Riker [email protected] Exobiology Leonard McCoy [email protected] B-
3 Geordi La Forge [email protected] Exobiology Leonard McCoy [email protected] A
4 Beverly Crusher [email protected] Warp Theory Montgomery Scott [email protected] A-
5 William Riker [email protected] Warp Theory Montgomery Scott [email protected] B
6 Geordi La Forge [email protected] Warp Theory Montgomery Scott [email protected] A+

To normalize their data, The user would want to turn this into the following tables:

Step 1: Extract Students

First, they would create a new Students table using the Student Name and Student Email columns.

ID Student Name Student Email
1 Beverly Crusher [email protected]
2 William Riker [email protected]
3 Geordi La Forge [email protected]

The existing roster table would become:

ID Student ID Class Teacher Name Teacher Email Grade
1 1 Exobiology Leonard McCoy [email protected] A+
2 2 Exobiology Leonard McCoy [email protected] B-
3 3 Exobiology Leonard McCoy [email protected] A
4 1 Warp Theory Montgomery Scott [email protected] A-
5 2 Warp Theory Montgomery Scott [email protected] B
6 3 Warp Theory Montgomery Scott [email protected] A+

Step 2: Extract Teachers

Second, they would create a new Teachers table using the Teacher Name column.

ID Teacher Name
1 Leonard McCoy
2 Montgomery Scott

The existing roster table would become:

ID Student ID Class Teacher ID Teacher Email Grade
1 1 Exobiology 1 [email protected] A+
2 2 Exobiology 1 [email protected] B-
3 3 Exobiology 1 [email protected] A
4 1 Warp Theory 2 [email protected] A-
5 2 Warp Theory 2 [email protected] B
6 3 Warp Theory 2 [email protected] A+

Step 3: Move Teacher Email from roster to Teachers

The user realizes they've made a mistake and left Teacher Email on the wrong table. They will then move the Teacher Email column to Teachers.

The Teacher table becomes:

ID Teacher Name Teacher Email
1 Leonard McCoy [email protected]
2 Montgomery Scott [email protected]

The roster table becomes:

ID Student ID Class Teacher ID Grade
1 1 Exobiology 1 A+
2 2 Exobiology 1 B-
3 3 Exobiology 1 A
4 1 Warp Theory 2 A-
5 2 Warp Theory 2 B
6 3 Warp Theory 2 A+

Step 4: Extract Classes and Teacher ID

Now we extract class information into its own table, Classes:

ID Class Teacher ID
1 Exobiology 1
2 Warp Theory 2

The roster table becomes:

ID Student ID Class ID Grade
1 1 1 A+
2 2 1 B-
3 3 1 A
4 1 2 A-
5 2 2 B
6 2 3 A+

Step 5: Rename roster to Grades

The user realizes that roster is now essentially a list of grades, so they rename the table to Grades.

Now they've transformed a table named roster into four tables:

  • Students
  • Teachers
  • Classes
  • Grades

Proposed solution

We need to guide the user through the data modeling process above.

Actions available

Users should be able to:

Create a new table based on moving a column from an existing table.

  • This is the process followed in Steps 1, 2, and 4 in the above example.

Move a column from an existing table to another existing table.

  • This is the process followed in Step 3 in the above example.
  • They could also do this process in reverse to move Teacher Email back to roster
  • We need to handle cases where the data cannot be easily correlated. For example, if there were different teacher emails for each teacher name, we need to provide the option of:

Merge existing tables A and B

  • This is the process the user would follow to reverse steps 1, 2, and 4 above. They would start with the resulting two tables and merge it back into the roster table.

Automatic creation of views

  • It might be helpful to automatically create a view that shows the list of fields previously shown in one table every time the user performs a data modeling action. This will allow them to see the same view of data as before (e.g. they can see all the roster information in one place), but have it be split into normalized tables underneath.
  • Perhaps we should ask the user whether to create a view or not.

Additional context

@kgodey kgodey added type: enhancement ready Ready for implementation labels Jun 9, 2021
@kgodey kgodey added this to the 8. Data Modeling milestone Jun 9, 2021
@mathemancer
Copy link
Contributor

The design may need to portray the concept of foreign keys (probably not in so many words) to help users understand the link between the created tables, and help them understand how entries are mated together when merging or moving a column from one table to another.

@kgodey kgodey changed the title Design for data model changes Design for data modeling Aug 23, 2021
@github-actions
Copy link

This issue has not been updated in 90 days and is being marked as stale.

@github-actions github-actions bot added the stale label Mar 11, 2022
@kgodey kgodey added status: draft and removed stale ready Ready for implementation labels Mar 11, 2022
@kgodey kgodey modified the milestones: [Beta] Data Modeling, Cycle 2 Jun 1, 2022
@kgodey kgodey modified the milestones: Cycle 2, Cycle 3 Jul 19, 2022
@kgodey
Copy link
Contributor Author

kgodey commented Jul 19, 2022

Although we are doing this feature in Cycle 3, this ticket no longer covers accurate design requirements, so I'm going to close it.

@kgodey kgodey closed this as completed Jul 19, 2022
Repository owner moved this from Draft to Done in [NO LONGER USED] Mathesar work tracker Jul 19, 2022
@kgodey kgodey closed this as not planned Won't fix, can't repro, duplicate, stale Jul 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

No branches or pull requests

2 participants