
Partitioning reads

Ryan Wick edited this page May 19, 2020 · 26 revisions

Requirements

Before this step, you'll need to have completed the previous step (multiple sequence alignment) for each of your good clusters. You'll also need the same long-read set you used in the previous steps.

Concept

Now that you have reconciled and aligned sequences for each cluster, this step will partition your reads between these clusters. That is, each read will be assigned to whichever cluster it aligns to best and saved into a file for that cluster.

This step is run once for your entire isolate (i.e. not on a per-cluster basis).
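To make the idea of "assign each read to its best cluster" concrete, here is a minimal sketch in shell. It is not Trycycler's own code, and how Trycycler scores alignments is not described on this page. The three-column input (read name, cluster name, score) and the function name `assign_reads` are hypothetical stand-ins for illustration only.

```shell
# Illustration only, not Trycycler's implementation.
# Input (hypothetical format): lines of "read_name cluster_name score".
# Output: one "read_name cluster_name" line per read, keeping the cluster
# with the highest score (the first one seen wins a tie).
assign_reads() {
    awk '
        !($1 in best) || ($3 + 0) > best[$1] { best[$1] = $3 + 0; cluster[$1] = $2 }
        END { for (r in best) print r, cluster[r] }
    ' "$@" | sort
}
```

Given a table in which each read has one score per cluster, this prints one line per read, so every read ends up in exactly one cluster.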

Running Trycycler partition

This step takes your cluster directories as input, each of which must have the 02_all_seqs.fasta file made by an earlier step (Trycycler reconcile).

Assuming you have deleted all of the bad cluster directories (i.e. the only cluster directories left are the good ones on which you've run Trycycler reconcile), this command would do the trick:

trycycler-runner.py partition --cluster_dirs trycycler/cluster_* --reads reads.fastq

Note that the star in that command is a glob which will expand to all the cluster directories. You could also explicitly list the cluster directories like this (assuming your good clusters are numbers 1, 7 and 8):

trycycler-runner.py partition --cluster_dirs trycycler/cluster_001 trycycler/cluster_007 trycycler/cluster_008 --reads reads.fastq
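Because every cluster directory must contain its 02_all_seqs.fasta file, it can help to check for that file before launching the command. The following is a small sketch of such a check. The helper name `missing_inputs` is made up for this example and is not part of Trycycler.

```shell
# Hypothetical helper (not part of Trycycler): print each cluster directory
# that lacks a non-empty 02_all_seqs.fasta, and return non-zero if any do.
missing_inputs() {
    status=0
    for dir in "$@"; do
        if [ ! -s "$dir/02_all_seqs.fasta" ]; then
            echo "$dir"
            status=1
        fi
    done
    return "$status"
}

# Example use: only run the partition step when nothing is reported.
# missing_inputs trycycler/cluster_* && \
#     trycycler-runner.py partition --cluster_dirs trycycler/cluster_* --reads reads.fastq
```

Any directory it prints is either a bad cluster that still needs deleting or a good cluster on which Trycycler reconcile has not yet been run.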

Settings

Trycycler partition has the following parameters you can adjust:

  • --min_aligned_len: reads with fewer than this many bases aligned (default = 1000) will be ignored.
  • --min_read_cov: reads with less than this percentage of their length covered by alignments (default = 90.0) will be ignored.
  • --threads: the number of threads Trycycler will use for read alignment. This only affects speed, so you'll probably want to use as many threads as you have available.
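As a worked illustration of the two filtering thresholds, the sketch below applies them to a single read. It follows the descriptions above rather than Trycycler's source code. In particular, it assumes that the number of aligned bases and the number of read bases covered by alignments are the same figure, which may not match how Trycycler counts them. The function name `passes_filters` is hypothetical.

```shell
# Illustration only: would a read pass both thresholds?
# Usage: passes_filters ALIGNED_BASES READ_LENGTH [MIN_ALIGNED_LEN] [MIN_READ_COV]
# Assumption: aligned bases and covered bases are treated as the same number.
passes_filters() {
    awk -v aligned="$1" -v len="$2" -v min_len="${3:-1000}" -v min_cov="${4:-90.0}" 'BEGIN {
        # Coverage is the aligned fraction of the read, as a percentage.
        ok = (len > 0 && aligned >= min_len && 100 * aligned / len >= min_cov)
        exit !ok
    }'
}

passes_filters 4800 5000 && echo "kept: 96% of a 5000-base read is aligned"
passes_filters 4000 5000 || echo "ignored: only 80% of the read is aligned"
passes_filters 900 950   || echo "ignored: fewer than 1000 aligned bases"
```

With the default values, a read must satisfy both conditions to be kept, so a short read can be ignored even when almost all of it aligns.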

Output

After Trycycler partition completes, each of the cluster directories should have a 4_reads.fastq file which contains its share of the total reads.
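One way to check the result is to count the reads that each cluster received. The sketch below assumes uncompressed FASTQ files with exactly four lines per read. The function names `count_reads` and `report_partition` are invented for this example.

```shell
# Count reads in a FASTQ file, assuming uncompressed four-line records.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Hypothetical helper: print "<directory> <read count>" for each cluster
# directory, or "<directory> missing" if it has no non-empty 4_reads.fastq.
report_partition() {
    for dir in "$@"; do
        reads="$dir/4_reads.fastq"
        if [ -s "$reads" ]; then
            echo "$dir $(count_reads "$reads")"
        else
            echo "$dir missing"
        fi
    done
}

# Example use:
# report_partition trycycler/cluster_*
```

The per-cluster counts may not add up to the total in your original read set, since reads that fail the filtering thresholds are ignored.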
