
Calib thread scalability

Calib is multithreaded. However, its time and memory efficiency drop as the number of threads increases (see the plot in scalability_plots.svg in this directory).

Currently, multithreading starts by generating all possible masks on the main thread. For a barcode length of 4 (4+4, so 8 bases in total) and an error tolerance of 1, there are eight such masks, one for each barcode position that can be ignored.
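As a rough illustration of the mask count above, here is a minimal sketch in Python. It assumes that a mask simply marks which positions of the concatenated 4+4 barcode are ignored, and that exactly as many positions are ignored as the error tolerance allows. The names `generate_masks` and `apply_mask` are hypothetical and are not taken from Calib's source code.

```python
from itertools import combinations


def generate_masks(barcode_length, error_tolerance):
    """Enumerate masks over the concatenated barcode (assumed layout).

    Each mask is a tuple of booleans, one per position: True means the
    position is compared, False means it is ignored.
    """
    total = 2 * barcode_length  # 4+4 -> 8 positions
    masks = []
    for ignored in combinations(range(total), error_tolerance):
        ignored_set = set(ignored)
        masks.append(tuple(i not in ignored_set for i in range(total)))
    return masks


def apply_mask(barcode, mask):
    """Replace every ignored position with '-' so masked barcodes can be compared."""
    return "".join(base if keep else "-" for base, keep in zip(barcode, mask))


masks = generate_masks(barcode_length=4, error_tolerance=1)
print(len(masks))                         # number of masks for 4+4 and tolerance 1
print(apply_mask("ACGTACGT", masks[0]))   # first mask ignores position 0
```

Under this assumption the number of masks is the number of ways to choose the ignored positions, which is 8 for a tolerance of 1 and grows quickly for larger tolerances.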

Any two nodes in Calib's graph are connected by an edge if they have identical barcodes after applying a mask and they also have a sufficient number of minimizers (see the main README file). Thus, each mask implies a set of edges on Calib's graph. Note that these sets of edges are not necessarily disjoint, especially if the error tolerance is > 1.

Calib's multithreading works by assigning each thread a subset of the masks; no mask is shared between threads. Each thread then uses its own masks to build a local version of the graph implied by those masks. Once a thread has finished computing its graph over all of its masks, it locks Calib's main graph and copies its local graph's edges into the main graph's edges. The downside is that every extra thread uses more RAM, as shown in the plotted tests. The plot is generated from the scripts here. The exact results are in the TSV files in this directory.
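The scheme described above can be sketched roughly as follows. This is a simplified Python illustration, not Calib's actual implementation: it ignores the minimizer condition entirely, represents each mask as a set of ignored positions, and uses hypothetical names (`masked`, `edges_for_mask`, `build_graph`).

```python
import threading
from collections import defaultdict
from itertools import combinations


def masked(barcode, ignored):
    """Return the barcode with the ignored positions replaced by '-'."""
    return "".join("-" if i in ignored else b for i, b in enumerate(barcode))


def edges_for_mask(barcodes, ignored):
    """Connect all nodes whose barcodes are identical under one mask."""
    buckets = defaultdict(list)
    for node, barcode in enumerate(barcodes):
        buckets[masked(barcode, ignored)].append(node)
    edges = set()
    for nodes in buckets.values():
        edges.update(combinations(nodes, 2))  # pairs (i, j) with i < j
    return edges


def build_graph(barcodes, masks, thread_count):
    main_edges = set()
    lock = threading.Lock()

    def worker(my_masks):
        # Each thread builds its own local edge set; this is the extra RAM.
        local_edges = set()
        for ignored in my_masks:
            local_edges |= edges_for_mask(barcodes, ignored)
        # Lock the main graph only while copying the local edges into it.
        with lock:
            main_edges.update(local_edges)

    # Thread t gets masks t, t + thread_count, ...; no mask is shared.
    threads = [
        threading.Thread(target=worker, args=(masks[t::thread_count],))
        for t in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return main_edges


barcodes = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
masks = [frozenset({i}) for i in range(8)]  # error tolerance 1 on 4+4
print(build_graph(barcodes, masks, thread_count=3))
```

Because edge sets from different masks can overlap, the same edge may be held in several local graphs at once before it is merged. In this sketch that duplication is one reason memory use grows with the thread count.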

Running simulated datasets tests

Please check the testing script available [here](
