
cylc.rundb: add task jobs table and refactor #1493

Merged: 5 commits, Jun 16, 2015

Conversation

matthewrmshin
Contributor

  • Simplify by removing threading logic.
  • Group insert and similar update statements, and write with executemany for efficiency (see the sketch after this list).
    • A single queue per table for inserts.
    • A queue for each unique update statement template.
  • On data retrieval, use a single select statement to get all relevant items.
  • Die immediately on any problem with the private database, to guarantee integrity.
    • The database can only be out of sync by a single iteration of the main loop.
  • Unlimited number of retries for the public database.
    Retries occur on the next iteration of the main loop - no sleep required.
    The suite can only be brought down if the queues grow so large that memory runs out,
    which is unlikely because things should recover sooner or later.
  • On restart,
    • Vacuum private database
    • Copy private database into public database to ensure they are in sync.
  • Add new task_jobs table.
  • Add new broadcasts table to record broadcast settings in searchable form.
    Remove broadcast_settings table, which stored unsearchable pickles.
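
A minimal sketch of the grouped-write idea, assuming illustrative names (`QueuedDAO`, `add` and `process_queued_ops` are not the branch's actual API):

```python
import sqlite3
from collections import defaultdict


class QueuedDAO(object):
    """Sketch: queue rows per statement template, flush with executemany."""

    def __init__(self, db_file_name):
        self.db_file_name = db_file_name
        # Statement template -> list of argument tuples: one queue per
        # table for inserts, one per unique update statement template.
        self.queues = defaultdict(list)

    def add(self, stmt, args):
        """Queue an insert/update; no disk I/O happens here."""
        self.queues[stmt].append(args)

    def process_queued_ops(self):
        """Write all queued rows, one executemany call per template."""
        conn = sqlite3.connect(self.db_file_name, timeout=0.2)
        try:
            for stmt, args_list in self.queues.items():
                conn.executemany(stmt, args_list)
            conn.commit()
        except sqlite3.OperationalError:
            conn.rollback()  # e.g. "database is locked": keep queues for retry
        else:
            self.queues.clear()
        finally:
            conn.close()
```

The key property is that a failed flush leaves the queues intact, so a later call simply retries the whole batch.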

@matthewrmshin self-assigned this Jun 9, 2015
@matthewrmshin added this to the soon milestone Jun 9, 2015
@matthewrmshin
Contributor Author

I have done some stress tests on this, and I have yet to see a "database is locked" error on writing to the databases.

(This is ready for initial comments, but not for merge, because I want to add some automated tests for the new tables.)

@matthewrmshin
Contributor Author

(Stress test based on https://gist.github.com/matthewrmshin/c5e301e060a1987afaf7 with the succeeded handler of each task doing a query on the public database. A loop is also set up to run continuously to query the public database on a separate terminal.)

@matthewrmshin
Contributor Author

@cylc/core please review.

return self.FMT_CREATE % {
    "name": self.name,
    "columns_str": ", ".join(column_str_list),
    "primary_keys_str": primary_keys_str}
Contributor

why not just do a .join on primary_keys as you do for column_str_list?

Contributor Author

Because not every table has a primary key?
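
To make that concrete, a hypothetical sketch of how `primary_keys_str` could be built so that it degrades to an empty clause (`self.columns` and the column attributes are assumptions; only `FMT_CREATE`, `self.name` and `column_str_list` come from the snippet above):

```python
# Hypothetical sketch: emit a PRIMARY KEY clause only when the table
# actually defines primary key columns, otherwise emit nothing.
primary_keys = [col.name for col in self.columns if col.is_primary_key]
if primary_keys:
    primary_keys_str = ", PRIMARY KEY(%s)" % ", ".join(primary_keys)
else:
    primary_keys_str = ""  # tables without a primary key get no clause
```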

@arjclark
Contributor

@matthewrmshin - from a quick pass through, I can't work out how it will cope with a temporary database lock (hence retries in the old system). Also, I don't think this is backwards compatible - you need to create the new table in an existing suite database.

@matthewrmshin
Contributor Author

from a quick pass through, I can't work out how it will cope with a temporary database lock (hence retries in the old system)

For the private database, it does not retry because a retry should never be necessary.

For the public database, it retries when the main loop calls self.pool.process_queued_db_ops() again in the next iteration.
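
A minimal sketch of these two policies, with illustrative names (`pri_dao`, `pub_dao` and `flush` are assumptions, not the branch's actual attributes):

```python
import sqlite3
import sys


class SuiteDatabaseManager(object):
    """Sketch of the private-dies/public-retries split."""

    def __init__(self, pri_dao, pub_dao):
        self.pri_dao = pri_dao  # private database DAO
        self.pub_dao = pub_dao  # public database DAO

    def process_queued_db_ops(self):
        """Called once per main loop iteration."""
        try:
            self.pri_dao.flush()  # private database write must succeed...
        except sqlite3.Error as exc:
            sys.exit("ERROR, private database: %s" % exc)  # ...or die at once
        try:
            self.pub_dao.flush()
        except sqlite3.Error:
            pass  # queued ops are kept and retried next iteration; no sleep
```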

@matthewrmshin
Contributor Author

Also, I don't think this is backwards compatible - you need to create the new table in an existing suite database.

No, but do we need to support backward compatibility for restart? I think it is always safer to stick with the same version of cylc on a restart. (Rose's suite commands reinforce that anyway.)

@dpmatthews
Contributor

For the public database, it retries when the main loop calls self.pool.process_queued_db_ops() again in the next iteration.

Is there a limit on the number of retries? Other suites may be relying on the public database to check for task completions (and the hook scripts currently rely on it), so we can't afford to let it get too far behind.

@hjoliver
Member

@matthewrmshin - regarding your stress testing, out of interest do you see locks when doing the same testing on current master? Also, your gist link seems to be broken.

@hjoliver
Member

Is there a limit on the number of retries? Other suites may be relying on the public database to check for task completions (and the hook scripts currently rely on it), so we can't afford to let it get too far behind.

This is a tough call... couldn't it be argued that important suites should carry on no matter what, unless their private db is somehow locked or broken? We could just warn the suite operator of the problem, and recommend a restart as soon as possible.

@hjoliver
Member

do we need to support backward compatibility for restart?

So far we have tried to do that.

@arjclark
Contributor

do we need to support backward compatibility for restart?

From the way we do things here during a PS, yes we do.

@arjclark
Contributor

...Other suites may be relying on the public database to check for task completions (and the hook scripts currently rely on it), so we can't afford to let it get too far behind.

This is a tough call... couldn't it be argued that important suites should carry on no matter what, unless their private db is somehow locked or broken? We could just warn the suite operator of the problem, and recommend a restart as soon as possible.

The problem is, though, if the external db gets too far out of sync then it's useless for external triggering, current hook scripts, rose-bush-like utilities, etc. Re-synchronisation then becomes an increasingly large problem on restart as well - and by the looks of things there's no dump any more(?). Without that, the external database will remain out of sync on a restart.

@arjclark
Contributor

You've also removed user-only access to the private database; can you put that back in please? It ensured that only the user and their suite processes could access it, so it was a very useful feature to have.

@matthewrmshin
Contributor Author

Is there a limit on the number of retries? Other suites may be relying on the public database to check for task completions (and the hook scripts currently rely on it), so we can't afford to let it get too far behind.

This is a tough call... couldn't it be argued that important suites should carry on no matter what, unless their private db is somehow locked or broken? We could just warn the suite operator of the problem, and recommend a restart as soon as possible.

I think we can implement logic where, after say 100 updates with incomplete operations, we simply copy the content of the private database into a temporary location next to the public database, and then rename that to become the new public database. Whatever connection is locking the old public database will still be attached to its inode, but it should no longer affect future accesses. (Hopefully, that process will soon recover to give up the lock, and then free up the disk space.)
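
A sketch of that copy-and-rename recovery (the function name and arguments are illustrative):

```python
import os
import shutil


def recover_public_db(pri_db_path, pub_db_path):
    """Replace a locked public database with a copy of the private one."""
    temp_path = pub_db_path + ".tmp"  # temporary file next to the public db
    shutil.copyfile(pri_db_path, temp_path)
    # rename is atomic within a file system: readers holding a lock on the
    # old public database keep its inode; new connections get the fresh copy
    os.rename(temp_path, pub_db_path)
```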

regarding your stress testing, out of interest do you see locks when doing the same testing on current master? Also, your gist link seems to be broken.

I can't get writes to fail with a lock at all in either case. However, I think external reads have become more robust with this branch.

Gist link now fixed.

the external database will remain out of sync on a restart.

No, that won't be the case any more, because we'll be copying the private database into the public one on restart.

@matthewrmshin
Contributor Author

Implemented recovery of public database file from private database file after 100 failed attempts.

  • Reduced automatic connection timeout to 0.2s.
  • Auto recovery tested with new battery test.

All other issues should be addressed.

  • Create tables if they don't exist (see the sketch below).
  • chmod 0600 for the private database.
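
A sketch of both fixes together (the function name and arguments are illustrative):

```python
import os
import sqlite3


def init_private_db(db_file_name, create_stmts):
    """Create tables if absent and restrict access to the suite owner."""
    conn = sqlite3.connect(db_file_name)
    for stmt in create_stmts:
        # e.g. "CREATE TABLE IF NOT EXISTS task_jobs (...)": harmless when
        # restarting a suite whose database already has the table
        conn.execute(stmt)
    conn.commit()
    conn.close()
    os.chmod(db_file_name, 0o600)  # read/write for the user only
```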

@matthewrmshin
Contributor Author

Branch rebased and squashed.

@hjoliver
Member

I like the lock-breaking recovery mechanism, tested with @dpmatthews' read_lock.c.

@hjoliver
Member

Tested with a suite of task pool size ~4500; task_pool.process_queued_db_ops() typically took about 1-1.5 seconds; no problems found.

Simplify by removing threading logic.

I take it you think there's no advantage in using a thread for db ops - even in very big, busy suites? (well it's clearly not a problem in my aforementioned test suite).

Vacuuming a db copied from one of our long-running suites (~300 MB) took almost 30 sec; might be worth printing info to stdout when doing this?

@arjclark - are you happy with the changes on this branch?

@matthewrmshin
Contributor Author

I take it you think there's no advantage in using a thread for db ops - even in very big, busy suites? (well it's clearly not a problem in my aforementioned test suite).

For the private database, there is no real advantage, because we want it to be like the state dump - always in sync with what is current. For the public database, which does not need to be fully in sync, there is some advantage in using a separate thread/process if writing to it becomes a bottleneck. At the moment, I don't see any advantage, as it does not appear to be a bottleneck.

(If writing to disk becomes a bottleneck for a busy suite, I would imagine that the priority should be to move the logic for writing the job files - writes to multiple files in separate directories - to separate threads/processes (or the process pool), before worrying about the public suite database - writes to a single file.)

Vacuuming a db copied from one of our long-running suites (~300 MB) took almost 30 sec; might be worth printing info to stdout when doing this?

I'll add some print statements.
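
For instance, a minimal sketch (the function name is illustrative):

```python
import sqlite3
import sys


def vacuum_db(db_file_name):
    """Vacuum a database file, telling the user what is going on."""
    # a large file (e.g. ~300 MB) can take tens of seconds to vacuum
    sys.stdout.write("Vacuuming %s ..." % db_file_name)
    sys.stdout.flush()
    conn = sqlite3.connect(db_file_name)
    conn.execute("VACUUM")  # must run outside a transaction, which is the
    conn.close()            # default state for a fresh sqlite3 connection
    sys.stdout.write(" done\n")
```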

@hjoliver
Member

For the private database, there is no real advantage, ...

This sort of thing should be documented in the code IMO - in case it's not obvious to future developers (or future us). Can you add a couple of comments?

* Remove threading model.
* Group insert/update statements. Write once per main loop iteration,
  with the smallest number of `executemany` calls.
* Refactor DAO to be more extensible.
* Where relevant, use single select statements to get multiple rows.
* Die immediately on any problem with private database.
* Up to 100 normal retries for writing to public database.
  * Reduce default timeout to 0.2s.
  * Retry once every main loop iteration.
  * Recover from private database after 100 unsuccessful retries.
* Ensure private and public database files are in sync on restart.
* Vacuum database file on restart.
* Add new `task_jobs` table.
* Add new `broadcasts` table to record searchable broadcast settings.
  Remove `broadcast_settings` table, which stored unsearchable pickles.

New tests:
* Contents in new broadcasts table.
* Contents in new task jobs table, local and remote jobs.
* Public database recovery from lock, after less than 100 failures.
* Public database recovery using private database, after 100 failures.

To ensure compatibility with cylc#1492. Otherwise, the suite can continue as
databases may still be connected to stalled inodes.
@matthewrmshin
Contributor Author

Can you add a couple of comments?

Done.

Branch rebased.

@hjoliver modified the milestones: next release, soon Jun 16, 2015
@hjoliver
Member

My test battery passes.

hjoliver added a commit that referenced this pull request Jun 16, 2015
cylc.rundb: add task jobs table and refactor
@hjoliver merged commit e206125 into cylc:master Jun 16, 2015
@arjclark
Contributor

@matthewrmshin - noting on ticket as discussed:

  • test battery passing in my environment
  • haven't been able to break via manual hacking around at this point
  • my only disagreements with what's now implemented are stylistic rather than functional

As discussed, to be on the safe side, I'd like this rolled out to our early adopters/general users for a while before we go for operational implementation, to make sure we haven't missed anything. (I can't see that we have, but there's usually something.)
