-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to 4.x sql duplicate key #37141
Comments
It would only be the inserts that include the primary incremental record id that would need changing so that should only be the ones for workflows and maybe finder |
Using a "replace" statement in MySQL or doing the equivalent in PostgreSQL is called "upsert". Unfortunately the minimum version requirements of J4 don't allow to use that because we still have to support older MySQL and MariaDB versions which do not support that. It could be solved by changing the insert statements to be done record by record, and for each record they use a select statement as a data source, similar as we do when we insert into the menu table and select the extension ID from the extensions table and put the result together with literal values for the other columns. That select statement would contain a condition which returns no result when there is already a record with that ID (or other PK which would be violated). We have done something similar for inserting the pkg_search record when updating. That solution would have 2 disadvantages:
|
The minimum mysql version supported is 5.6 and that does support replace https://dev.mysql.com/doc/refman/5.6/en/replace.html |
there is also INSERT ON DUPLICATE KEY UPDATE and INSERT IGNORE or am I missing something? |
Well, then it was either MariaDB or PostgreSQL where there was some problem why we could not do that. Another problem is that even when we fix the duplicate key issue, there will be other issues which we can't fix with SQL only, see my comments in and it is also worth a question if we should leave the people with the illusion that their site is fine when we fix some of the glitches after failed updates but other not so visible glitches are remaining. |
MariaDB supports all the methods I described. It is postgres that does not and would require a more complicated solution. However as 98% of all installs are on mysql/mariadb just changing those updates would be a massive improvement. As only a small % of updates hit this issue then there will not be very many running postgresql at all. Regarding the other issues. I agree with you and its why I introduced the logging although it still says installation success at the end when the sql updates terminated :( but lets fix the issues we can fix now. And no I wont complain about immutable sql files this time (especially if it is commented as you did with the child templates) |
I will investigate again on weekend and report back. But I'm still not sure if we should do something with it because it will not solve other problems like indexes already existing and MySQL not supporting an IX NOT EXISTS for adding columns and so on. |
found this for postgres upsert |
and for adding columns can we use a stored procedure? https://thispointer.com/mysql-add-column-if-not-exists/ |
We have to check since which version upsert is supported in PostgreSQL. I think it could be sufficient just to use INSERT IGNORE at some places. Regarding stored procedures: That could be a way, but the SQL using that would not be handled by the database checker. |
See also #35664 . |
As far as I can see we could use The question is if we really need that updating of existing records or if it would not just be enough to do nothing on conflict, i.e. not insert or update anything when the record already exists, because in the cases handled here (new stuff e.g. for workflows added with an update SQL) we can assume that the existing records after a first, unsuccessful update attempt will not be modified and nobody will create new workflows stuff in backend manually before the 2nd update attempt. Also a way could be just to delete the records in the tables for workflows before inserting the new records (but after the For other stuff which we insert on updates, like e.g. records in the menu table or in the extensions table or the assets table, we do not use hard-coded IDs. In such a case, when we run the update SQL again, we insert again new items for the same thing but with a different ID. This will result in redundant records, and an update on a violated primary key will not help. We would need other conditions (or keys) to identify what "the same thing" means in a particular table, like e.g. the combination of Another way to solve this and other issues would be PR #36506 , but I am not sure if it is sufficient to just allow the SQL to fail or if we would need some kind of condition which failure we allow. The correct update of the schema version in the database after each successful run of an SQL update script which that PR adds is a necessary fix independently from that condition thing. |
I think for the duplicate keys we can assume that nobody modified the records in the mean time, so we do not need to update them if we run the update SQL script a second time, we simply need to do nothing when the record with that PK already exists. That means we can use For the latter see https://www.postgresqltutorial.com/postgresql-upsert/ . I will prepare a PR in the next days. |
If you are curious: See #36506 (comment) ... I have created a branch with the change from PR #36506 and the necessary changes on the old update SQL scripts plus the change from #37154 . The changes can be seen here: 4.1-dev...richard67:test-pr-36506 . It only needs some additional comments in the old update SQL scripts. PR will come tomorrow, I think. |
PR is #37156 , but as long as it's work in progress we leave this issue here open. The code changes are already done, it just needs to add comments to the changes and complete the PR description and testing instructions. |
I just came to close it but if you want it to stay open for now I am happy with that also |
Closing as having a pull request. Please test #37156 . If you have tested that with success, you can also mark a successful test result for #36506 . And in my PR #37156 please check section "RFC: What is worth a discussion - feedback is welcome" in the description and let me know your opinion. Thanks in advance. |
Update: The test of my PR does not cover all changes in the other PR #36506 , so it needs to test both PRs. |
One of the errors that is coming up quite often is a result of a user trying a failed update for a second time and running into this sql error
There are several options available to us to resolve by amending the query "insert into ..."
In our specific case the most relevant would appear to be changing the query to "replace into ..."
When issuing a REPLACE statement, there are two possible outcomes for each issued command:
@richard67 thoughts? You know way more than me on sql
The text was updated successfully, but these errors were encountered: