Some of the stats programs offer k-mer reports (with -K) and probe-id counting (with -D). On very large files (>200 million reads), these programs can consume a lot of RAM (>10 GB), even with the highly efficient sparsehash library.
A disk-backed key-value store like LevelDB could offer hash-like performance while also allowing growth past available RAM. I'm thinking the code should switch to a DB-backed store at the 200 million record level. This would slow things down by about 3x (from ~1 million writes/sec to ~300k writes/sec), but would allow effectively unbounded growth. Enabling a large LRU cache could make it perform so similarly that the sparse hash could be abandoned entirely, especially if the DB remains an insignificant fraction of the overall stats-collection time.
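A minimal sketch of what the switch could look like, assuming C++ and the stock sparsehash and LevelDB APIs. The `KmerCounter` class, the record threshold, and the 1 GB cache size are illustrative assumptions, not anything in the existing code:

```cpp
// Sketch: count keys in a google::sparse_hash_map until a record
// threshold, then migrate everything into LevelDB with a large LRU
// block cache. KmerCounter is a hypothetical wrapper for illustration.
#include <cstdint>
#include <string>
#include <sparsehash/sparse_hash_map>
#include <leveldb/db.h>
#include <leveldb/cache.h>

class KmerCounter {
 public:
  explicit KmerCounter(const std::string& db_path,
                       uint64_t threshold = 200000000ULL)  // 200 mil records
      : db_path_(db_path), threshold_(threshold), records_(0),
        cache_(nullptr), db_(nullptr) {}

  ~KmerCounter() {
    delete db_;     // close the DB before releasing its block cache
    delete cache_;
  }

  void Increment(const std::string& key) {
    // Each Increment() stands in for one record here; a real
    // implementation would count input reads instead.
    if (db_ == nullptr && ++records_ > threshold_) SpillToDisk();
    if (db_ == nullptr) {
      ++mem_[key];
    } else {
      // Read-modify-write through LevelDB; the LRU block cache keeps
      // hot blocks in RAM so repeated keys stay fast.
      std::string val;
      uint64_t count = 0;
      leveldb::Status s = db_->Get(leveldb::ReadOptions(), key, &val);
      if (s.ok()) count = std::stoull(val);
      db_->Put(leveldb::WriteOptions(), key, std::to_string(count + 1));
    }
  }

 private:
  void SpillToDisk() {
    leveldb::Options options;
    options.create_if_missing = true;
    // 1 GB LRU cache for uncompressed blocks (size is an assumption).
    cache_ = leveldb::NewLRUCache(1024ULL * 1048576);
    options.block_cache = cache_;
    leveldb::DB::Open(options, db_path_, &db_);  // error handling omitted
    // Migrate the in-memory counts, then release the hash's RAM.
    for (const auto& kv : mem_)
      db_->Put(leveldb::WriteOptions(), kv.first, std::to_string(kv.second));
    mem_.clear();
  }

  std::string db_path_;
  uint64_t threshold_;
  uint64_t records_;
  google::sparse_hash_map<std::string, uint64_t> mem_;
  leveldb::Cache* cache_;
  leveldb::DB* db_;
};
```

The per-key read-modify-write through `Get`/`Put` is what would cap throughput near the ~300k writes/sec estimated above; batching updates with `leveldb::WriteBatch` and storing fixed-width binary counts instead of decimal strings would claw some of that back.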
Original issue reported on code.google.com by [email protected] on 9 Jul 2014 at 2:26