This repository has been archived by the owner on Oct 13, 2022. It is now read-only.

0.6.0

@bagashe bagashe released this 06 Oct 15:42
· 1372 commits to master since this release

Additions and improvements since release 0.5.0

  • Option --ReadGraph.creationMethod 2 activates a more robust way to create the read graph. It uses the statistical distribution of various alignment metrics to select alignment criteria - see the documentation for more details. Use with one of the new sample configuration files (see below). It provides the following benefits:
    • Less sensitive to choice of alignment metric thresholds, as long as those thresholds are chosen in a very permissive way.
    • Improves assembly contiguity and accuracy.
    • Less sensitive to the amount of available coverage. Works well down to 20X coverage using the sample configuration files provided, although assembly contiguity decreases with coverage.
  • Experimental iterative assembly functionality for partial haplotype separation (phased diploid assembly) and improved resolution of long repeats such as segmental duplications. It currently requires Ultra-Long (UL) reads at high coverage (80X). Use with configuration file Nanopore-UL-iterative-Sep2020.conf.
  • Option --MarkerGraph.minCoverage can now be set to 0 for automatic selection of a reasonable value.
  • Option --MarkerGraph.minCoveragePerStrand can be used to specify a minimum required per-strand coverage (number of supporting reads) for a marker graph vertex to be generated. This can reduce assembly errors due to strand-dependent systematic errors.
  • Option --ReadGraph.desiredCoverage can be used to automatically increase the read length cutoff to reduce coverage to a desired value.
  • Option --Assembly.detangleMethod 2 can be used to select a less conservative detangling method, which is also configurable with various new command line options.
  • Memory optimizations result in significant reductions in memory requirements. Peak virtual memory usage is now reported at the end of an assembly and in AssemblySummary.html.
  • Support for the ARM platform (see below under Platforms for more information).
  • New script GenerateConfig.py aids in creating a custom configuration file.
  • New script GenerateFeedback.py can be used to assess a completed assembly. When filing a Shasta issue for an unsatisfactory assembly, please include the output of this script plus AssemblySummary.html.
  • Documentation and benchmarks to permit running on machines with less than the ideal amount of memory.
  • New sample configuration files, all of which include --ReadGraph.creationMethod 2. Use with Shasta option --config.
    • Nanopore-Sep2020.conf: best currently known parameter set for standard nanopore reads generated by the Guppy base caller version 3.6.0 or later.
    • Nanopore-UL-Sep2020.conf: best currently known parameter set for Ultra-Long (UL) nanopore reads generated by the Guppy base caller version 3.6.0 or later.
    • Nanopore-OldGuppy-Sep2020.conf: best currently known parameter set for standard nanopore reads generated by the Guppy base caller versions 3.0 through 3.5.
    • Nanopore-UL-iterative-Sep2020.conf: experimental configuration file for iterative assembly using high coverage (80X) with Ultra-Long (UL) nanopore reads generated by the Guppy base caller version 3.6.0 or later. Provides partial haplotype separation (phased diploid assembly) and improved resolution of segmental duplications.
  • Usability improvements.
  • Improvements and additions in the HTTP server.
  • Documentation improvements and additions, including significant additions to the page on Shasta computational methods.
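
As an illustration of how the options listed above might be combined, here is a minimal Python sketch that assembles a Shasta command line. The option names --config, --MarkerGraph.minCoverage and --MarkerGraph.minCoveragePerStrand come from the list above; the --input option, the file name reads.fasta, the value 1 for --MarkerGraph.minCoveragePerStrand, and the helper build_shasta_command are assumptions made for this example, so check them against the Shasta documentation before use.

```python
import shlex


def build_shasta_command(executable, reads, config, extra_options=None):
    """Return an argument list for one Shasta run (hypothetical helper).

    extra_options maps option names without the leading "--" to values.
    """
    # --input is assumed here to be the option that names the reads file.
    command = [executable, "--input", reads, "--config", config]
    for name, value in (extra_options or {}).items():
        command += [f"--{name}", str(value)]
    return command


command = build_shasta_command(
    "./shasta-Linux-0.6.0",
    "reads.fasta",            # hypothetical input file name
    "Nanopore-Sep2020.conf",  # sample configuration file from this release
    {
        "MarkerGraph.minCoverage": 0,           # 0 requests automatic selection
        "MarkerGraph.minCoveragePerStrand": 1,  # illustrative value only
    },
)

# Print a shell-quoted form that can be reviewed before running it.
print(shlex.join(command))
```

The list form can be passed to subprocess.run as-is, which avoids shell quoting problems with unusual file names.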

Platforms

Linux

  • The shasta-Linux-0.6.0 executable will run on most current 64-bit Linux systems that use kernel version 3.2.0 or later. This includes all Ubuntu versions starting at 12.04 plus CentOS 7 and 8.

  • The shasta-OldLinux-0.6.0 executable will run on most current 64-bit Linux systems that use kernel version 2.6.32 or later. This includes CentOS 6. CentOS 6 reaches end of support on November 30, 2020, and kernel versions older than 3.2.0 are aging and no longer widely used or supported. Therefore, the shasta-OldLinux executable will not be included in future Shasta releases. Future Shasta releases will only run on systems that use Linux kernel 3.2.0 or later. They will not run on older systems, including CentOS 6.
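
The kernel version thresholds above can be turned into a simple check. The following Python sketch picks one of the two Linux executables from a kernel release string; the function names are hypothetical, and the check covers only the kernel version, not the other properties implied by "most current 64-bit Linux systems".

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a string like '5.4.0-48-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"Unrecognized kernel release string: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def choose_linux_executable(release):
    """Return the 0.6.0 Linux executable name for a kernel release, or None."""
    version = parse_kernel_version(release)
    if version >= (3, 2, 0):
        return "shasta-Linux-0.6.0"
    if version >= (2, 6, 32):
        return "shasta-OldLinux-0.6.0"
    return None  # kernel older than 2.6.32: neither executable is expected to run
```

On a Linux machine, the release string of the running kernel can be obtained with platform.release() from the Python standard library and passed to choose_linux_executable.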

macOS

In contrast with previous Shasta releases, this release provides a single macOS executable, shasta-macOS-0.6.0. This executable can be used on both macOS 10.14 (Mojave) and macOS 10.15 (Catalina).

Windows

As in previous releases, the Linux executable shasta-Linux-0.6.0 can be used on Windows under Windows Subsystem for Linux (WSL).

ARM

This Shasta release includes an ARM executable, shasta-Linux-ARM-0.6.0, which can be used on 64-bit ARM version 8 platforms. It is known to work at least in the following environments:

  • Graviton2 processors running 64-bit Ubuntu 20.04 on AWS instance types r6g and m6g.
  • Raspberry Pi Model 4 running 64-bit Ubuntu 20.04.
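
For scripted downloads, the platform sections above can be summarized as a lookup from operating system and CPU architecture to executable name. This is a rough Python sketch: the table and the function choose_executable are hypothetical, the strings "x86_64" and "aarch64" are the values platform.machine() typically reports on 64-bit Intel/AMD and 64-bit ARM Linux systems, and the lookup ignores the kernel version distinction between shasta-Linux and shasta-OldLinux.

```python
import platform

# Executable names are taken from this release; the keys are
# (platform.system(), platform.machine()) pairs assumed for illustration.
EXECUTABLES = {
    ("Linux", "x86_64"): "shasta-Linux-0.6.0",
    ("Linux", "aarch64"): "shasta-Linux-ARM-0.6.0",
    ("Darwin", "x86_64"): "shasta-macOS-0.6.0",
}


def choose_executable(system=None, machine=None):
    """Return the executable name for a platform, or None if none is listed."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return EXECUTABLES.get((system, machine))
```

Calling choose_executable() with no arguments inspects the current machine. Python running inside Windows Subsystem for Linux reports the system as Linux, so the Linux entry applies there as well.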

Compatibility

This release is not compatible with previous releases. There were incompatible changes in some command line option names, the binary formats used, and the Python API. You cannot use release 0.6.0 for postprocessing of an assembly done using a previous release.