Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synapse won't start after upgrade attempt from 1.59.1 to 1.62: slow database migration means we get timed out by systemd #13193

Closed
Thatoo opened this issue Jul 5, 2022 · 14 comments
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@Thatoo
Copy link

Thatoo commented Jul 5, 2022

Description

Synapse keep restarting.
journalctl -xe indicates
The unit matrix-synapse.service has entered the 'failed' state with result 'timeout'.

tail -f -n 10 /var/log/matrix-synapse/homeserver.log
indicates as a loop

2022-07-06 00:21:28,496 - root - 343 - WARNING - main - ***** STARTING SERVER *****
2022-07-06 00:21:28,496 - root - 347 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.7/site-packages/synapse/app/homeserver.py version 1.62.0
2022-07-06 00:21:28,496 - root - 349 - INFO - main - Server hostname: defis.info
2022-07-06 00:21:28,496 - root - 350 - INFO - main - Instance name: master
2022-07-06 00:21:28,496 - synapse.app.homeserver - 377 - INFO - main - Setting up server
2022-07-06 00:21:28,496 - synapse.server - 306 - INFO - main - Setting up.
2022-07-06 00:21:28,497 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2022-07-06 00:21:28,497 - synapse.storage.databases - 72 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2022-07-06 00:21:28,497 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
2022-07-06 00:21:40,916 - synapse.storage.prepare_database - 123 - INFO - main - ['main', 'state']: Existing schema is 69 (+2 deltas)
2022-07-06 00:21:40,916 - synapse.storage.databases.main - 329 - INFO - main - Checking database for consistency with configuration...
2022-07-06 00:21:40,916 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v69
2022-07-06 00:21:40,917 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 69/02cache_invalidation_index.sql
2022-07-06 00:21:40,917 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v70
2022-07-06 00:21:40,918 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/01clean_table_purged_rooms.sql
2022-07-06 00:21:40,918 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/08_state_group_edges_unique.sql
2022-07-06 00:21:40,918 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v71
2022-07-06 00:21:40,919 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 71/01rebuild_event_edges.sql.sqlite
2022-07-06 00:23:03,526 - root - 343 - WARNING - main - ***** STARTING SERVER *****
2022-07-06 00:23:03,527 - root - 347 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.7/site-packages/synapse/app/homeserver.py version 1.62.0
2022-07-06 00:23:03,527 - root - 349 - INFO - main - Server hostname: defis.info
2022-07-06 00:23:03,527 - root - 350 - INFO - main - Instance name: master
2022-07-06 00:23:03,527 - synapse.app.homeserver - 377 - INFO - main - Setting up server
2022-07-06 00:23:03,527 - synapse.server - 306 - INFO - main - Setting up.
2022-07-06 00:23:03,527 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2022-07-06 00:23:03,527 - synapse.storage.databases - 72 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2022-07-06 00:23:03,528 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
2022-07-06 00:23:10,888 - synapse.storage.prepare_database - 123 - INFO - main - ['main', 'state']: Existing schema is 69 (+2 deltas)
2022-07-06 00:23:10,888 - synapse.storage.databases.main - 329 - INFO - main - Checking database for consistency with configuration...
2022-07-06 00:23:10,893 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v69
2022-07-06 00:23:10,896 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 69/02cache_invalidation_index.sql
2022-07-06 00:23:10,901 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v70
2022-07-06 00:23:10,901 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/01clean_table_purged_rooms.sql
2022-07-06 00:23:10,902 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/08_state_group_edges_unique.sql
2022-07-06 00:23:10,903 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v71
2022-07-06 00:23:10,904 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 71/01rebuild_event_edges.sql.sqlite

Steps to reproduce

after update from 1.59.1 to 1.62, synapse wpn't restart

Homeserver

matrix.defis.info

Synapse Version

1.62

Installation Method

No response

Platform

debian 10 proxmox container behind nginx reverse proxy

Relevant log output

Jul  6 00:37:12 Matrix systemd[1]: matrix-synapse.service: Start operation timed out. Terminating.
Jul  6 00:37:12 Matrix systemd[1]: matrix-synapse.service: Main process exited, code=killed, status=15/TERM
Jul  6 00:37:12 Matrix systemd[1]: matrix-synapse.service: Failed with result 'timeout'.
Jul  6 00:37:12 Matrix systemd[1]: Failed to start Synapse Matrix homeserver.
Jul  6 00:37:15 Matrix systemd[1]: matrix-synapse.service: Service RestartSec=3s expired, scheduling restart.
Jul  6 00:37:15 Matrix systemd[1]: matrix-synapse.service: Scheduled restart job, restart counter is at 20.
Jul  6 00:37:15 Matrix systemd[1]: Stopped Synapse Matrix homeserver.
Jul  6 00:37:15 Matrix systemd[1]: Starting Synapse Matrix homeserver...
Jul  6 00:37:18 Matrix matrix-synapse[1001]: This server is configured to use 'matrix.org' as its trusted key server via the
Jul  6 00:37:18 Matrix matrix-synapse[1001]: 'trusted_key_servers' config option. 'matrix.org' is a good choice for a key
Jul  6 00:37:18 Matrix matrix-synapse[1001]: server since it is long-lived, stable and trusted. However, some admins may
Jul  6 00:37:18 Matrix matrix-synapse[1001]: wish to use another server for this purpose.
Jul  6 00:37:18 Matrix matrix-synapse[1001]: To suppress this warning and continue using 'matrix.org', admins should set
Jul  6 00:37:18 Matrix matrix-synapse[1001]: 'suppress_key_server_warning' to 'true' in homeserver.yaml.
Jul  6 00:37:18 Matrix matrix-synapse[1001]: --------------------------------------------------------------------------------

Anything else that would be useful to know?

No response

@DMRobertson
Copy link
Contributor

DMRobertson commented Jul 5, 2022

How are you installing synapse? Docker image? Matrix.org debian package?

@Thatoo
Copy link
Author

Thatoo commented Jul 5, 2022

Matrix.org debian package : matrix-synapse-py3 package

@Thatoo
Copy link
Author

Thatoo commented Jul 5, 2022

I feel like the timeout happen before a work running on the db end
The unit matrix-synapse.service has entered the 'failed' state with result 'timeout'.
Can I stop synapse and run a command that would launch an update/migration/upgarde on the db that is required?
And then start the server again once the job on db is over? What would be this command?

If I guess wrong and if ypu're thinking about an other problem and have a solution to share, I'm eager to hear it.

@Thatoo
Copy link
Author

Thatoo commented Jul 5, 2022

If I put DEBUG instead of INFO in /etc/matrix-synapse/log.yaml here is the loop I get :

2022-07-06 01:55:20,461 - root - 343 - WARNING - main - ***** STARTING SERVER *****
2022-07-06 01:55:20,461 - root - 347 - WARNING - main - Server /opt/venvs/matrix-synapse/lib/python3.7/site-packages/synapse/app/homeserver.py version 1.62.0
2022-07-06 01:55:20,461 - root - 349 - INFO - main - Server hostname: defis.info
2022-07-06 01:55:20,461 - root - 350 - INFO - main - Instance name: master
2022-07-06 01:55:20,461 - synapse.app.homeserver - 377 - INFO - main - Setting up server
2022-07-06 01:55:20,461 - synapse.server - 306 - INFO - main - Setting up.
2022-07-06 01:55:20,462 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2022-07-06 01:55:20,462 - synapse.storage.databases - 72 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2022-07-06 01:55:20,462 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} BEGIN TRANSACTION
2022-07-06 01:55:20,462 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000024 sec
2022-07-06 01:55:20,462 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
2022-07-06 01:55:20,463 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} CREATE TABLE IF NOT EXISTS schema_version( Lock CHAR(1) NOT NULL DEFAULT 'X' UNIQUE, version INTEGER NOT NULL, upgraded BOOL NOT NULL, CHECK (Lock='X') )
2022-07-06 01:55:25,196 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 4.733506 sec
2022-07-06 01:55:25,197 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} CREATE TABLE IF NOT EXISTS schema_compat_version( Lock CHAR(1) NOT NULL DEFAULT 'X' UNIQUE, compat_version INTEGER NOT NULL, CHECK (Lock='X') )
2022-07-06 01:55:25,197 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000047 sec
2022-07-06 01:55:25,198 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} CREATE TABLE IF NOT EXISTS applied_schema_deltas( version INTEGER NOT NULL, file TEXT NOT NULL, UNIQUE(version, file) )
2022-07-06 01:55:25,198 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000024 sec
2022-07-06 01:55:25,199 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} CREATE TABLE IF NOT EXISTS applied_module_schemas( module_name TEXT NOT NULL, file TEXT NOT NULL, UNIQUE(module_name, file) )
2022-07-06 01:55:25,199 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000021 sec
2022-07-06 01:55:25,200 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} SELECT version, upgraded FROM schema_version
2022-07-06 01:55:25,200 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000055 sec
2022-07-06 01:55:25,201 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} SELECT compat_version FROM schema_compat_version
2022-07-06 01:55:25,201 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000038 sec
2022-07-06 01:55:25,201 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} SELECT file FROM applied_schema_deltas WHERE version >= ?
2022-07-06 01:55:25,202 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (69,)
2022-07-06 01:55:25,202 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000057 sec
2022-07-06 01:55:25,203 - synapse.storage.prepare_database - 123 - INFO - main - ['main', 'state']: Existing schema is 69 (+2 deltas)
2022-07-06 01:55:25,203 - synapse.storage.databases.main - 329 - INFO - main - Checking database for consistency with configuration...
2022-07-06 01:55:25,204 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} SELECT name FROM users LIMIT 1
2022-07-06 01:55:25,204 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000052 sec
2022-07-06 01:55:25,204 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DELETE FROM schema_compat_version
2022-07-06 01:55:25,208 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.003304 sec
2022-07-06 01:55:25,209 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO schema_compat_version(compat_version) VALUES (?)
2022-07-06 01:55:25,209 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (71,)
2022-07-06 01:55:25,210 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000054 sec
2022-07-06 01:55:25,210 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v69
2022-07-06 01:55:25,211 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DELETE FROM schema_version
2022-07-06 01:55:25,213 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.002175 sec
2022-07-06 01:55:25,214 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO schema_version (version, upgraded) VALUES (?,?)
2022-07-06 01:55:25,214 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (69, True)
2022-07-06 01:55:25,215 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000067 sec
2022-07-06 01:55:25,215 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 69/02cache_invalidation_index.sql
2022-07-06 01:55:25,216 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO background_updates (ordering, update_name, progress_json) VALUES (6902, 'cache_invalidation_index_by_instance', '{}')
2022-07-06 01:55:25,218 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.002078 sec
2022-07-06 01:55:25,219 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO applied_schema_deltas (version, file) VALUES (?,?)
2022-07-06 01:55:25,219 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (69, '69/02cache_invalidation_index.sql')
2022-07-06 01:55:25,221 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.002086 sec
2022-07-06 01:55:25,222 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v70
2022-07-06 01:55:25,222 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DELETE FROM schema_version
2022-07-06 01:55:25,223 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000012 sec
2022-07-06 01:55:25,223 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO schema_version (version, upgraded) VALUES (?,?)
2022-07-06 01:55:25,224 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (70, True)
2022-07-06 01:55:25,224 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000025 sec
2022-07-06 01:55:25,225 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/01clean_table_purged_rooms.sql
2022-07-06 01:55:25,225 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DELETE FROM federation_inbound_events_staging WHERE room_id not in ( SELECT room_id FROM rooms )
2022-07-06 01:55:25,226 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000058 sec
2022-07-06 01:55:25,226 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO applied_schema_deltas (version, file) VALUES (?,?)
2022-07-06 01:55:25,227 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (70, '70/01clean_table_purged_rooms.sql')
2022-07-06 01:55:25,227 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000016 sec
2022-07-06 01:55:25,227 - synapse.storage.prepare_database - 519 - INFO - main - Applying schema 70/08_state_group_edges_unique.sql
2022-07-06 01:55:25,228 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO background_updates (ordering, update_name, progress_json) VALUES (7008, 'state_group_edges_unique_idx', '{}')
2022-07-06 01:55:25,228 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000034 sec
2022-07-06 01:55:25,229 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO applied_schema_deltas (version, file) VALUES (?,?)
2022-07-06 01:55:25,229 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (70, '70/08_state_group_edges_unique.sql')
2022-07-06 01:55:25,230 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000016 sec
2022-07-06 01:55:25,230 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v71
2022-07-06 01:55:25,230 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DELETE FROM schema_version
2022-07-06 01:55:25,231 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000012 sec
2022-07-06 01:55:25,231 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO schema_version (version, upgraded) VALUES (?,?)
2022-07-06 01:55:25,232 - synapse.storage.SQL - 380 - DEBUG - main - [SQL values] {prepare_database} (71, True)
2022-07-06 01:55:25,232 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.000025 sec
2022-07-06 01:55:25,233 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 71/01rebuild_event_edges.sql.sqlite
2022-07-06 01:55:25,233 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} CREATE TABLE new_event_edges ( event_id TEXT NOT NULL, prev_event_id TEXT NOT NULL, room_id TEXT NULL, is_state BOOL NOT NULL DEFAULT 0, FOREIGN KEY(event_id) REFERENCES events(event_id) )
2022-07-06 01:55:25,240 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 0.005741 sec
2022-07-06 01:55:25,240 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} INSERT INTO new_event_edges SELECT ee.event_id, ee.prev_event_id, ee.room_id, ee.is_state FROM event_edges ee JOIN events ev USING (event_id) WHERE NOT ee.is_state
2022-07-06 01:56:21,452 - synapse.storage.SQL - 401 - DEBUG - main - [SQL time] {prepare_database} 56.211636 sec
2022-07-06 01:56:21,452 - synapse.storage.SQL - 372 - DEBUG - main - [SQL] {prepare_database} DROP TABLE event_edges

should I stop synapse and run these three sql command myself
CREATE TABLE new_event_edges ( event_id TEXT NOT NULL, prev_event_id TEXT NOT NULL, room_id TEXT NULL, is_state BOOL NOT NULL DEFAULT 0, FOREIGN KEY(event_id) REFERENCES events(event_id) );
INSERT INTO new_event_edges SELECT ee.event_id, ee.prev_event_id, ee.room_id, ee.is_state FROM event_edges ee JOIN events ev USING (event_id) WHERE NOT ee.is_state;
DROP TABLE event_edges;
and then restart synapse server?

@Thatoo
Copy link
Author

Thatoo commented Jul 6, 2022

or would it be possible to increase the time given to synapse to start before timeout?

@DMRobertson DMRobertson changed the title Synapse won't start after upgrade attempt from 1.69.1 to 1.62 Synapse won't start after upgrade attempt from 1.59.1 to 1.62 Jul 6, 2022
@DMRobertson
Copy link
Contributor

DMRobertson commented Jul 6, 2022

Can I stop synapse and run a command that would launch an update/migration/upgarde on the db that is required?

Yes. Try running update_synapse_database --database-config /etc/matrix-synapse/homeserver.yaml.

or would it be possible to increase the time given to synapse to start before timeout?

You could edit the systemd config, see e.g. https://unix.stackexchange.com/a/276785/190092. But I'm not sure how that interacts with the matrix.org debian packages---on future upgrades, you might be asked to resolve a conflict between the packaged systemd unit file and the edited one. I'd try to update the DB first.

@DMRobertson DMRobertson added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 6, 2022
@DMRobertson
Copy link
Contributor

@Thatoo please let me know if you manage to update your database and start synapse afterwards.

@Thatoo
Copy link
Author

Thatoo commented Jul 6, 2022

We decided to restore a backup to v 1.59.1 and it works

@Thatoo Thatoo closed this as completed Jul 6, 2022
@podiki
Copy link

podiki commented Jul 21, 2022

In case it helps anyone else, I think I had the same problem on update to 1.63.0 (not sure the previous version, perhaps 1.61), some of the log:

Jul 21 15:05:36  synapse[710]: 2022-07-21 15:05:36,359 - synapse.app.homeserver - 377 - INFO - main - Setting up server
Jul 21 15:05:36  synapse[710]: 2022-07-21 15:05:36,359 - synapse.server - 306 - INFO - main - Setting up.
Jul 21 15:05:36  synapse[710]: 2022-07-21 15:05:36,361 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
Jul 21 15:05:36  synapse[710]: 2022-07-21 15:05:36,361 - synapse.storage.databases - 69 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
Jul 21 15:05:36  synapse[710]: 2022-07-21 15:05:36,363 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
Jul 21 15:06:08  synapse[710]: 2022-07-21 15:06:08,477 - synapse.storage.prepare_database - 119 - INFO - main - ['main', 'state']: Existing schema is 71 (+0 deltas)
Jul 21 15:06:08  synapse[710]: 2022-07-21 15:06:08,479 - synapse.storage.databases.main - 329 - INFO - main - Checking database for consistency with configuration...
Jul 21 15:06:08  synapse[710]: 2022-07-21 15:06:08,489 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v71
Jul 21 15:06:08  synapse[710]: 2022-07-21 15:06:08,491 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 71/01rebuild_event_edges.sql.sqlite
Jul 21 15:06:59  systemd[1]: synapse.service: start operation timed out. Terminating.
Jul 21 15:07:00  systemd[1]: synapse.service: Failed with result 'timeout'.
Jul 21 15:07:00  systemd[1]: Failed to start Synapse Matrix homeserver (master).
Jul 21 15:07:00  systemd[1]: synapse.service: Consumed 17.541s CPU time.
Jul 21 15:07:03  systemd[1]: synapse.service: Scheduled restart job, restart counter is at 5.

Running sudo update_synapse_database --database-config /etc/synapse/homeserver.yaml (note that --config is not an option) took about 6 minutes (on a Pi) but completed successfully. Synapse now starts up just fine, thanks @DMRobertson!

@DMRobertson
Copy link
Contributor

(note that --config is not an option)

Ooops, thanks. I'ved edited #13193 (comment) to fix this.

@DMRobertson DMRobertson changed the title Synapse won't start after upgrade attempt from 1.59.1 to 1.62 Synapse won't start after upgrade attempt from 1.59.1 to 1.62: slow database migration means we get timed out by systemd Jul 25, 2022
@reivilibre
Copy link
Contributor

This happened to me too, but in my case it was because a pg_dump was running to back up my database, that was holding enough of a lock preventing the schema migrations from progressing.
In your systemd service file, you can set TimeoutSec=300 or something to give it more time.

@DMRobertson
Copy link
Contributor

DMRobertson commented Aug 11, 2022

Another report of a similar situation in synapse admins today. I could have sworn we thought that we tell systemd that we're READY before starting the migrations... but maybe I misremember.

I wonder if we can use EXTEND_TIMEOUT_USEC during a migration to communicate that we're still alive and making forward progress with the migration.

@richvdh
Copy link
Member

richvdh commented Aug 12, 2022

Another report of a similar situation in synapse admins today. I could have sworn we thought that we tell systemd that we're READY before starting the migrations... but maybe I misremember.

definitely not: we make sure the database is up to date before starting the http listeners, and we want to start the http listeners before we tell systemd we're ready.

I wonder if we can use EXTEND_TIMEOUT_USEC during a migration to communicate that we're still alive and making forward progress with the migration.

possibly, but note that the problem here is a single big migration (specifically, 71/01rebuild_event_edges.sql.sqlite). It's hard to know that we are actually making progress while that's going on.

The postgres equivalent makes a lot of effort to make sure that the migration happens quickly (by pushing most of the hard work to the background), but I didn't really bother for sqlite, because if you've got a big enough sqlite database to notice the delay, you're doing it wrong.

So really I think this is yet another case of: we need to do more to discourage the use of sqlite in production (cf #2317, #2917)

@podiki
Copy link

podiki commented Dec 28, 2022

Happened again for me and had to run the database update manually.

I think this issue should be re-opened (with a more general title) until a proper fix is in place or documented workaround (or just output in synapse log if something is expected to take a while and may need manual intervention?).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

5 participants