Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration failed by "error in table task after rename: duplicate column name: message" #17328

Closed
wxiaoguang opened this issue Oct 16, 2021 · 12 comments · Fixed by #17475
Closed
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug
Milestone

Comments

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Oct 16, 2021

Gitea Version

1.15.4

Description

Many users report that the migration fails:

2021/10/16 08:45:14 ...ations/migrations.go:422:Migrate() [I] [SQL] ALTER TABLE `task` RENAME COLUMN errors TO message [] - 6.592953ms
2021/10/16 08:45:14 ...ations/migrations.go:422:Migrate() [I] [SQL] ROLLBACK [] - 113.064µs
2021/10/16 08:45:14 routers/common/db.go:33:InitDBEngine() [E] ORM engine initialization attempt #1/10 failed. Error: migrate: do migrate: error in table task after rename: duplicate column name: message
2021/10/16 08:45:14 routers/common/db.go:34:InitDBEngine() [I] Backing off for 3 seconds
2021/10/16 08:45:17 routers/common/db.go:27:InitDBEngine() [I] ORM engine initialization attempt #2/10...

Removing the message field in task table resolves this problem and can complete the migration.

The problem seems to be caused by Gitea accessed the task ORM too early (before migration), so the task table was changed (message was added while error also exists) before the migration and then the conflict occurs.

@wxiaoguang wxiaoguang added type/bug issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented labels Oct 16, 2021
@zeripath
Copy link
Contributor

I don't understand where this could be happening - we'd need to be calling Sync2() before the Migrate function runs but I can't find any example of this.

The interesting thing is that not every user reports this - which suggests that whatever is going on is either some kind of race or it's related to something else.

@wxiaoguang
Copy link
Contributor Author

Yep, I cannot reproduce this bug either. One user was using pgsql while another was using sqlite3. My only guess is whether there is some code triggering task early before a long running migration finishes.

@noerw noerw modified the milestone: 1.15.5 Oct 17, 2021
@wxiaoguang
Copy link
Contributor Author

Is it possible caused by some git push when migration is running? I just guess that git push triggers hooks to call gitea binary.

@lunny
Copy link
Member

lunny commented Oct 18, 2021

Is it possible caused by some git push when migration is running? I just guess that git push triggers hooks to call gitea binary.

It's impossible. When migrating, web listener is not up, so internal routes will not accept any requests.

@zeripath
Copy link
Contributor

Hmm ... Could this be due to having more than one entry in the version table?

I've seen a few people who get weird partial dB upgrades in the past and I don't understand why - maybe that would do it?

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Oct 21, 2021

Just remember another issue: #16828 , similar problem, both message and errors exist in task table.

@wxiaoguang
Copy link
Contributor Author

wxiaoguang commented Oct 21, 2021

Hmm ... Could this be due to having more than one entry in the version table?

I've seen a few people who get weird partial dB upgrades in the past and I don't understand why - maybe that would do it?

The code here https://github.com/go-gitea/gitea/blob/main/models/migrations/migrations.go#L416-L423 (using auto generated id) can be buggy if a user ran delete * from version, the next auto-increment id will be id=2, then https://github.com/go-gitea/gitea/blob/main/models/migrations/migrations.go#L363-L367 will never find the version record, then insert a new record id=3, then id=4, and so on.

I am not sure whether it is really related.

@jswolf19
Copy link

jswolf19 commented Oct 28, 2021

Going from 1.14.76 to 1.15.0/1.15.5, I'm running into this issue. I'll post the log, including relevant info:

The relevant call stacks seem to be

routers/install/install.go:210:SubmitInstall()
models/models.go:192:NewTestEngine()
models/models.go:179:syncTables()
return x.StoreEngine("InnoDB").Sync2(tables...)

is coming before

routers/install/install.go:413:SubmitInstall()
routers/install/setting.go:43:ReloadSettings()
routers/common/db.go:28:InitDBEngine()
models.NewEngine(ctx, migrations.Migrate)

log(1.14.76→1.15.5)

2021/10/28 14:15:13 cmd/web.go:102:runWeb() [I] Starting Gitea on PID: 1
2021/10/28 14:15:13 ...s/install/setting.go:21:PreloadSettings() [I] AppPath: /usr/bin/gitea
2021/10/28 14:15:13 ...s/install/setting.go:22:PreloadSettings() [I] AppWorkPath: /usr/share/gitea
2021/10/28 14:15:13 ...s/install/setting.go:23:PreloadSettings() [I] Custom path: /etc/gitea
2021/10/28 14:15:13 ...s/install/setting.go:24:PreloadSettings() [I] Log path: /var/log/gitea
2021/10/28 14:15:13 ...s/install/setting.go:25:PreloadSettings() [I] Preparing to run install page
2021/10/28 14:15:13 ...s/install/setting.go:28:PreloadSettings() [I] SQLite3 Supported
2021/10/28 14:15:13 cmd/web.go:196:listen() [I] Listen: http://0.0.0.0:3000
2021/10/28 14:15:13 cmd/web.go:199:listen() [I] LFS server enabled
2021/10/28 14:15:13 ...s/graceful/server.go:62:NewServer() [I] Starting new Web server: tcp:0.0.0.0:3000 on PID: 1
2021/10/28 14:15:15 ...ers/common/logger.go:21:1() [I] Started GET / for 10.40.2.189:58266
2021/10/28 14:15:15 ...ers/common/logger.go:30:1() [I] Completed GET / 200 OK in 80.031096ms
2021/10/28 14:15:15 ...ers/common/logger.go:21:1() [I] Started GET /assets/css/index.css?v=66b87c123c0ab767e55d95c43d170494 for 10.40.2.189:58268
2021/10/28 14:15:15 ...ers/common/logger.go:21:1() [I] Started GET /assets/js/index.js?v=66b87c123c0ab767e55d95c43d170494 for 10.40.2.189:58270
2021/10/28 14:15:15 ...ers/common/logger.go:30:1() [I] Completed GET /assets/js/index.js?v=66b87c123c0ab767e55d95c43d170494 200 OK in 52.27514ms
2021/10/28 14:15:16 ...ers/common/logger.go:30:1() [I] Completed GET /assets/css/index.css?v=66b87c123c0ab767e55d95c43d170494 200 OK in 836.68163ms
2021/10/28 14:15:16 ...ers/common/logger.go:21:1() [I] Started GET /assets/img/loading.png for 10.40.2.189:58272
2021/10/28 14:15:16 ...ers/common/logger.go:30:1() [I] Completed GET /assets/img/loading.png 200 OK in 195.34µs
2021/10/28 14:15:38 ...ers/common/logger.go:21:1() [I] Started POST / for 10.40.2.189:58552
2021/10/28 14:15:38 ...s/install/install.go:210:SubmitInstall() [I] [SQL] SELECT `TABLE_NAME`, `ENGINE`, `AUTO_INCREMENT`, `TABLE_COMMENT` from `INFORMATION_SCHEMA`.`TABLES` WHERE `TABLE_SCHEMA`=? AND (`ENGINE`='MyISAM' OR `ENGINE` = 'InnoDB' OR `ENGINE` = 'TokuDB') [gitea] - 37.920296ms
2021/10/28 14:15:38 models/models.go:192:NewTestEngine() [I] [SQL] SELECT `COLUMN_NAME`, `IS_NULLABLE`, `COLUMN_DEFAULT`, `COLUMN_TYPE`, `COLUMN_KEY`, `EXTRA`, `COLUMN_COMMENT`, (INSTR(VERSION(), 'maria') > 0 && (SUBSTRING_INDEX(VERSION(), '.', 1) > 10 || (SUBSTRING_INDEX(VERSION(), '.', 1) = 10 && (SUBSTRING_INDEX(SUBSTRING(VERSION(), 4), '.', 1) > 2 || (SUBSTRING_INDEX(SUBSTRING(VERSION(), 4), '.', 1) = 2 && SUBSTRING_INDEX(SUBSTRING(VERSION(), 6), '-', 1) >= 7))))) AS NEEDS_QUOTE FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_SCHEMA` = ? AND `TABLE_NAME` = ? ORDER BY `COLUMNS`.ORDINAL_POSITION [gitea user] - 1.024869ms

... //line 22-151 omitted

2021/10/28 14:15:39 models/models.go:192:NewTestEngine() [I] [SQL] SELECT `COLUMN_NAME`, `IS_NULLABLE`, `COLUMN_DEFAULT`, `COLUMN_TYPE`, `COLUMN_KEY`, `EXTRA`, `COLUMN_COMMENT`, (INSTR(VERSION(), 'maria') > 0 && (SUBSTRING_INDEX(VERSION(), '.', 1) > 10 || (SUBSTRING_INDEX(VERSION(), '.', 1) = 10 && (SUBSTRING_INDEX(SUBSTRING(VERSION(), 4), '.', 1) > 2 || (SUBSTRING_INDEX(SUBSTRING(VERSION(), 4), '.', 1) = 2 && SUBSTRING_INDEX(SUBSTRING(VERSION(), 6), '-', 1) >= 7))))) AS NEEDS_QUOTE FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_SCHEMA` = ? AND `TABLE_NAME` = ? ORDER BY `COLUMNS`.ORDINAL_POSITION [gitea task] - 1.724384ms
2021/10/28 14:15:39 models/models.go:192:NewTestEngine() [I] [SQL] SELECT `INDEX_NAME`, `NON_UNIQUE`, `COLUMN_NAME` FROM `INFORMATION_SCHEMA`.`STATISTICS` WHERE `TABLE_SCHEMA` = ? AND `TABLE_NAME` = ? [gitea task] - 700.628µs
2021/10/28 14:15:39 models/models.go:192:NewTestEngine() [I] [SQL] ALTER TABLE `task` ADD `message` TEXT NULL  [] - 54.830699ms
2021/10/28 14:15:39 ...om/go-chi/chi/mux.go:436:routeHTTP() [W] Table task has column errors but struct has not related field

...// line 155-188 omitted

2021/10/28 14:15:40 routers/common/db.go:20:InitDBEngine() [I] Beginning ORM engine initialization.
2021/10/28 14:15:40 routers/common/db.go:27:InitDBEngine() [I] ORM engine initialization attempt #1/10...
2021/10/28 14:15:40 modules/web/route.go:64:func1() [I] PING DATABASE mysql
2021/10/28 14:15:40 ...ations/migrations.go:419:Migrate() [I] Migration[178]: Add LFS columns to Mirror
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column id but struct has not related field
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column repo_id but struct has not related field
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column interval but struct has not related field
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column enable_prune but struct has not related field
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column updated_unix but struct has not related field
2021/10/28 14:15:41 routers/common/db.go:28:InitDBEngine() [W] Table mirror has column next_update_unix but struct has not related field
2021/10/28 14:15:41 ...ations/migrations.go:419:Migrate() [I] Migration[179]: Convert avatar url to text
2021/10/28 14:15:41 ...ations/migrations.go:419:Migrate() [I] Migration[180]: Delete credentials from past migrations
2021/10/28 14:15:41 ...ations/migrations.go:419:Migrate() [I] Migration[181]: Always save primary email on email address table
2021/10/28 14:15:41 ...s/install/setting.go:43:ReloadSettings() [W] Table email_address Column lower_email db nullable is false, struct nullable is true
2021/10/28 14:15:42 ...ations/migrations.go:419:Migrate() [I] Migration[182]: Add issue resource index table
2021/10/28 14:15:42 ...ations/migrations.go:419:Migrate() [I] Migration[183]: Create PushMirror table
2021/10/28 14:15:42 ...ations/migrations.go:419:Migrate() [I] Migration[184]: Rename Task errors to message
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column id but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column doer_id but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column owner_id but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column repo_id but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column start_time but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column end_time but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column payload_content but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column created but struct has not related field
2021/10/28 14:15:42 ...s/install/setting.go:43:ReloadSettings() [W] Table task has column message but struct has not related field
2021/10/28 14:15:42 routers/common/db.go:33:InitDBEngine() [E] ORM engine initialization attempt #1/10 failed. Error: migrate: do migrate: Error 1060: Duplicate column name 'message'

@zeripath
Copy link
Contributor

zeripath commented Oct 28, 2021

AHA!

Thank you this is exactly the help I needed to understand what the problem is.

The problem is on line 193 here:

gitea/models/models.go

Lines 182 to 193 in 6110ddc

// NewTestEngine sets a new test xorm.Engine
func NewTestEngine() (err error) {
x, err = GetNewEngine()
if err != nil {
return fmt.Errorf("Connect to database: %v", err)
}
x.SetMapper(names.GonicMapper{})
x.SetLogger(NewXORMLogger(!setting.IsProd()))
x.ShowSQL(!setting.IsProd())
return syncTables()
}

In 1.16 this has been moved to:

gitea/models/db/engine.go

Lines 131 to 142 in 157de0f

// NewTestEngine sets a new test xorm.Engine
func NewTestEngine() (err error) {
x, err = GetNewEngine()
if err != nil {
return fmt.Errorf("Connect to database: %v", err)
}
x.SetMapper(names.GonicMapper{})
x.SetLogger(NewXORMLogger(!setting.IsProd))
x.ShowSQL(!setting.IsProd)
return syncTables()
}

The syncTables function is being run there when it should really just run migration.

Now the question is: Why are you running the install page again?! It was not intended that you would run the install page again.

zeripath added a commit to zeripath/gitea that referenced this issue Oct 28, 2021
The underlying problem in go-gitea#17328 appears to be that users are re-running the install
page during upgrades. The function that tests and creates the db did not intend for
this and thus instead the migration scripts being run - a simple sync tables occurs.

This then causes a weird partially migrated DB which causes, in this release cycle,
the duplicate column in task table error. It is likely the cause of some weird
partial migration errors in other cycles too.

This PR simply ensures that the migration scripts are also run at this point too.

Fix go-gitea#17328

Signed-off-by: Andrew Thornton <[email protected]>
@jswolf19
Copy link

jswolf19 commented Oct 28, 2021

@zeripath I'm using docker, and I create a new image to switch over to when I upgrade. So I guess the reason I run the install page again is that I'm not sure how to not do it; it pops up when I start up the container for the first time.

(I imagine it's because my ini file isn't in a volume)

@jswolf19
Copy link

jswolf19 commented Oct 29, 2021

I set INSTALL_LOCK to true in the gitea.ini file my Dockerfile uses and was able to spin up an instance with 1.14.6 data and 1.15.6 installed. It seems to be running without issue, getting past migration 184.

zeripath added a commit that referenced this issue Oct 29, 2021
* Run Migrate in Install rather than just SyncTables

The underlying problem in #17328 appears to be that users are re-running the install
page during upgrades. The function that tests and creates the db did not intend for
this and thus instead the migration scripts being run - a simple sync tables occurs.

This then causes a weird partially migrated DB which causes, in this release cycle,
the duplicate column in task table error. It is likely the cause of some weird
partial migration errors in other cycles too.

This PR simply ensures that the migration scripts are also run at this point too.

Fix #17328

Signed-off-by: Andrew Thornton <[email protected]>
zeripath added a commit to zeripath/gitea that referenced this issue Oct 29, 2021
Backport go-gitea#17475

The underlying problem in go-gitea#17328 appears to be that users are re-running the install
page during upgrades. The function that tests and creates the db did not intend for
this and thus instead the migration scripts being run - a simple sync tables occurs.

This then causes a weird partially migrated DB which causes, in this release cycle,
the duplicate column in task table error. It is likely the cause of some weird
partial migration errors in other cycles too.

This PR simply ensures that the migration scripts are also run at this point too.

Fix go-gitea#17328

Signed-off-by: Andrew Thornton <[email protected]>
zeripath added a commit that referenced this issue Oct 30, 2021
Backport #17475

The underlying problem in #17328 appears to be that users are re-running the install
page during upgrades. The function that tests and creates the db did not intend for
this and thus instead the migration scripts being run - a simple sync tables occurs.

This then causes a weird partially migrated DB which causes, in this release cycle,
the duplicate column in task table error. It is likely the cause of some weird
partial migration errors in other cycles too.

This PR simply ensures that the migration scripts are also run at this point too.

Fix #17328

Signed-off-by: Andrew Thornton <[email protected]>
@wxiaoguang
Copy link
Contributor Author

Just to record here: we have an issue tracking this problem

@wxiaoguang wxiaoguang added this to the 1.16.0 milestone Nov 18, 2021
Chianina pushed a commit to Chianina/gitea that referenced this issue Mar 28, 2022
* Run Migrate in Install rather than just SyncTables

The underlying problem in go-gitea#17328 appears to be that users are re-running the install
page during upgrades. The function that tests and creates the db did not intend for
this and thus instead the migration scripts being run - a simple sync tables occurs.

This then causes a weird partially migrated DB which causes, in this release cycle,
the duplicate column in task table error. It is likely the cause of some weird
partial migration errors in other cycles too.

This PR simply ensures that the migration scripts are also run at this point too.

Fix go-gitea#17328

Signed-off-by: Andrew Thornton <[email protected]>
@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants