
Autocycler table

Ryan Wick edited this page Oct 29, 2024 · 20 revisions

Basics

The autocycler table command generates a TSV line from the various metrics stored in YAML files during an Autocycler assembly.

When conducting many automated Autocycler assemblies, you can use autocycler table to build a TSV file containing metrics for each assembly. This lets you identify any samples which have assembled poorly.

Example command

For this example, I assume you have conducted many Autocycler assemblies, where each sample is in a directory that starts with SAM and there is an autocycler directory in each of those:

autocycler table > metrics.tsv  # create the TSV header
for sample in SAM*; do
    autocycler table -a "$sample"/autocycler -n "$sample" >> metrics.tsv  # append a TSV row
done

When you run this loop on your own data, you will likely need to change the SAM* glob to a pattern that matches your sample directories.
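If the glob may also match entries that have no autocycler directory, the loop can be wrapped in a small function that skips them. This is only a sketch: build_metrics is a hypothetical helper name, and it uses nothing beyond the -a and -n options shown in the example.

```shell
# Hypothetical helper (not part of Autocycler): build a metrics TSV
# from the sample directories given as arguments.
# Usage: build_metrics metrics.tsv SAM*
build_metrics() {
    out="$1"; shift
    autocycler table > "$out"  # header line (no -a given)
    for sample in "$@"; do
        # Skip anything that lacks an autocycler directory
        if [ ! -d "$sample/autocycler" ]; then
            echo "skipping $sample: no autocycler directory" >&2
            continue
        fi
        autocycler table -a "$sample/autocycler" -n "$sample" >> "$out"  # append a TSV row
    done
}
```

Skipped samples are reported on stderr, so they do not end up in the TSV file.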

Full usage

Usage: autocycler table [OPTIONS]

Options:
  -a, --autocycler_dir <AUTOCYCLER_DIR>  Autocycler directory (if absent, a header line will be output)
  -n, --name <NAME>                      Sample name [default: blank]
  -f, --fields <FIELDS>                  Comma-delimited list of YAML fields to include [default:
                                         "input_reads, pass_cluster_count, fail_cluster_count,
                                         overall_clustering_score, untrimmed_cluster_size,
                                         untrimmed_cluster_distance, trimmed_cluster_size,
                                         trimmed_sequence_length_mad, consensus_assembly_total_length,
                                         consensus_assembly_total_unitigs,
                                         consensus_assembly_fully_resolved"]
  -s, --sigfigs <SIGFIGS>                Significant figures to use for floating point numbers
                                         [default: 3]
  -h, --help                             Print help
  -V, --version                          Print version
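As an illustration of the --fields and --sigfigs options, the sketch below prints one row with a shorter field list and four significant figures. The function name custom_row is hypothetical, the three fields are taken from the default list above, and the list is written without spaces on the assumption that this is the safest form.

```shell
# Hypothetical helper (not part of Autocycler): print one TSV row for a
# sample directory using a custom field list and 4 significant figures.
# Usage: custom_row SAM1
custom_row() {
    autocycler table -a "$1/autocycler" -n "$1" \
        -f "input_reads,consensus_assembly_total_length,consensus_assembly_fully_resolved" \
        -s 4
}
```

To get a matching header line, the same -f value would need to be passed when running autocycler table without -a.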

Notes

  • The default value for --fields includes metrics that are useful for judging how well the assembly went, but you can choose from all of Autocycler's metrics.
  • Some fields appear in multiple files, and their values are combined into a list. For example, the trimmed_sequence_length_mad metric is stored in 2_trimmed.yaml files, one of which is made for each cluster. Since a single assembly can have multiple clusters, there can be multiple trimmed_sequence_length_mad values for an assembly (one value per cluster).
  • If a YAML file is missing from the Autocycler directory, that's okay, but any metric from that file will be blank.
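Since rows can contain lists and blank values, it can help to view one metric at a time next to the sample name. The helper below is a minimal sketch, not an Autocycler feature: tsv_pick is a hypothetical name, and it assumes that the first column of the TSV holds the sample name and that the header row uses the field names as column names.

```shell
# Hypothetical helper: print the first column and one named column of a TSV file.
# Usage: tsv_pick metrics.tsv consensus_assembly_fully_resolved
tsv_pick() {
    awk -F'\t' -v col="$2" '
        NR == 1 {
            # Find the index of the requested column in the header row
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
        }
        { print $1 "\t" $idx }
    ' "$1"
}
```

A blank second column in the output then points to a sample whose YAML file for that metric is missing.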