No fasta file input argument in co-assembly binning using SemiBin2 train_self #182

Open
Ijingyuliu opened this issue Jan 16, 2025 · 0 comments

Error output:
SemiBin2: error: unrecognized arguments: -i AU_KOA_F_contig.fa
Running
SemiBin2 train_self -h
shows no -i option, but the tutorial at https://semibin.readthedocs.io/en/latest/usage/ passes a FASTA file with -i in its example for the advanced co-assembly binning workflow. Which one is correct? Do I need to provide the contig file for model training in co-assembly?

Tutorials

SemiBin2 train_self \
    -i contig.fa \
    --data contig_output/data.csv \
    --data-split contig_output/data_split.csv \
    -o contig_output

Output

usage: SemiBin2 train_self [-h] --data [DATA ...] --data-split [DATA_SPLIT ...] -o OUTPUT [--batch-size BATCHSIZE] [--train-from-many | --no-train-from-many] [--epochs EPOCHES] [--tmpdir TMPDIR] [-p] [--verbose | --quiet] [--random-seed RANDOM_SEED]
                           [--engine ENGINE]

options:
  -h, --help            show this help message and exit
  --batch-size BATCHSIZE
                        Batch size used in the training process (Default: 2048).
  --train-from-many, --no-train-from-many
                        Train the model with several samples. You must provide data, data_split, cannot, and fasta files for corresponding samples in the same order. Note: You can only use `--train-from-many` mode when performing single-sample
                        binning. Training from many samples with multi-sample binning is not supported.
  --epochs EPOCHES, --epoches EPOCHES
                        Number of epochs used in the training process (Default: 15).
  --tmpdir TMPDIR       option to set temporary directory
  -p , --processes , -t , --threads 
                        Number of CPUs used (pass the value 0 to use all CPUs, default: 0)
  --verbose             Verbose output
  --quiet, -q           Quiet output
  --random-seed RANDOM_SEED
                        Random seed. Set it to a fixed value to reproduce results across runs. The default is that the seed is set by the system and .
  --engine ENGINE       device used to train the model (auto/gpu/cpu, auto means if SemiBin detects the gpu, SemiBin will use GPU)

Mandatory arguments:
  --data [DATA ...]     Path to the input data.csv file.
  --data-split [DATA_SPLIT ...]
                        Path to the input data_split.csv file.
  -o OUTPUT, --output OUTPUT
                        Output directory (will be created if non-existent)
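
Going only by the help output above, a train_self call would presumably drop -i and pass just the three mandatory arguments. The following is a minimal sketch, not confirmed by the maintainers: the paths are the tutorial's placeholders, and the variable names (DATA, DATA_SPLIT, OUT, CMD) are made up for illustration. It only prints the command so it can be checked before running.

```shell
# Placeholder paths copied from the tutorial example; adjust to the real output directory.
DATA=contig_output/data.csv
DATA_SPLIT=contig_output/data_split.csv
OUT=contig_output

# Only the flags listed under "Mandatory arguments" in the help output; no -i.
CMD="SemiBin2 train_self --data $DATA --data-split $DATA_SPLIT -o $OUT"

# Print the command for inspection; run it with: eval "$CMD"
echo "$CMD"
```

Whether omitting the contig FASTA is actually intended for co-assembly training is the open question in this issue.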
