Pipeline - draft1 #1

Open: wants to merge 65 commits into base: dev

@prototaxites prototaxites commented Nov 26, 2024

Adds:

  • Input from YAML
  • Assembly
    • metaMDBG
    • assembly statistics, including the number of circular contigs
  • Read mapping to assembly (Hi-C and PacBio)
  • Binning with
    • MetaBAT2
    • MaxBin2
    • bin3C
    • MetaTOR
  • Bin refinement
    • gene prediction with Pyrodigal
    • MAGScoT
    • DAS Tool
  • Bin QC
    • CheckM2
    • SeqKit statistics
    • Infernal + tRNAscan-SE: count tRNAs
  • Taxonomy
    • GTDB-Tk
    • TaxonKit: get NCBI taxids
  • Summary
    • assembly statistics
    • bin statistics and scoring

Jim Downie and others added 9 commits November 14, 2024 14:23
- Add sketch binning subworkflow
- Map reads + hic to reference
- Patch read mapping modules so they output the meta object
from the reference
- Calculate depths
- binning workflow has metabat2, maxbin2 and bin3c
- bin3c modules
- read hic enzymes input from YAML
- move hic cram preprocessing to new PREPARE_DATA workflow

Minor updates:
- Patch MaxBin2 to emit ungzipped fasta

- Skeleton of binning with Metator - needs uncontaminated container to
  continue.
- Bin refinement with Magscot

Fixes:
- Binning with Metator now functional
@prototaxites prototaxites self-assigned this Nov 26, 2024
@prototaxites prototaxites marked this pull request as ready for review November 26, 2024 15:52
Merge pull request #1 from prototaxites/main
prototaxites added a commit that referenced this pull request Nov 27, 2024
@prototaxites prototaxites changed the title Pipeline skeleton - input, assembly, binning and bin refinement Pipeline - draft1 Nov 28, 2024
Jim Downie and others added 8 commits January 6, 2025 16:58
- circular contig counting
    - counts circular contigs using custom module
    - custom genome stats module that incorporates circular counts
- basic bin scoring
- new bin summary output - per group summary
- remove collate_bins parameter and just collate bins for gtdbtk and
  checkm2
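
For reviewers unfamiliar with the circular contig counting step, here is a minimal Python sketch of the idea. It is not the custom module from this PR. It assumes the assembler marks circularity in each FASTA header with a `circular=yes` or `circular=no` field (as I understand metaMDBG headers to look), and the function name `count_circular` is hypothetical.

```python
import io
import re
from typing import Iterable

# Assumed header convention (not confirmed by this PR), e.g.
# ">ctg12 length=48213 coverage=31 circular=yes".
CIRCULAR_FLAG = re.compile(r"\bcircular=(yes|no)\b")


def count_circular(fasta_lines: Iterable[str]) -> tuple[int, int]:
    """Return (total contigs, contigs flagged circular) from FASTA lines."""
    total = 0
    circular = 0
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        total += 1
        match = CIRCULAR_FLAG.search(line)
        # Headers without the flag are counted as non-circular.
        if match and match.group(1) == "yes":
            circular += 1
    return total, circular


# Tiny in-memory example; a real run would iterate over an open FASTA file.
example = io.StringIO(
    ">ctg0 length=12 coverage=30 circular=yes\nACGTACGTACGT\n"
    ">ctg1 length=8 coverage=5 circular=no\nACGTACGT\n"
    ">ctg2 length=4 coverage=2\nACGT\n"
)
total, circular = count_circular(example)
print(f"{circular} of {total} contigs flagged circular")
```

Treating a missing flag as non-circular is a conservative choice for this sketch. The module in this PR may handle that case differently.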
@prototaxites prototaxites marked this pull request as ready for review January 16, 2025 15:10
@weaglesBio weaglesBio left a comment

Looking good, just need to resolve a few comments.

3 participants