error: empty overlap set! and multiple consensus sequences #222

Open
sandaruwanrat opened this issue Sep 8, 2022 · 6 comments

sandaruwanrat commented Sep 8, 2022

Hello,

My question has two parts.

1. When I run the command

racon -m 8 -x -6 -g -8 -w 500 cluster_1329.fasta cluster_1329_ovlp_mapping_test_fwd.paf TAIR10_chr_all.fasta > cluster_1329_tmp_consensus_test3.fasta
I am getting the following error.
[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] error: empty overlap set!

However, I only get this error for some of the cluster files; the others work fine.

Below are my minimap2 commands. I have tried both PAF and SAM output:
minimap2 -x map-ont -t 1 -uf TAIR10_chr_all.fasta cluster_1329.fasta > cluster_1329_ovlp_mapping_test_fwd.paf

minimap2 -ax map-ont -t 1 -uf TAIR10_chr_all.fasta cluster_1329.fasta > cluster_1329_ovlp_mapping_test_fwd.sam

2. I get multiple consensus sequences

As I mentioned in part one, racon does generate consensus sequences for some cluster files with the same command:

racon -m 8 -x -6 -g -8 -w 500 cluster_9.fasta cluster_9_ovlp_mapping_test_fwd.paf TAIR10_chr_all.fasta > cluster_9_tmp_consensus_test3.fasta
The problem is that the output contains more than one sequence (example below):

`>chr1 LN:i:30427560 RC:i:155 XC:f:0.000049
Sequence

>chr5 LN:i:30427560 RC:i:155 XC:f:0.000049
Sequence
`

I would greatly appreciate your feedback on this.
Thank you very much.

@rvaser
Collaborator

rvaser commented Sep 9, 2022

Hello,

  1. Please verify whether cluster_1329_ovlp_mapping_test_fwd.paf is empty.
  2. How many sequences are there in the target file TAIR10_chr_all.fasta? I am not sure I understand what you are trying to achieve.

Best regards,
Robert
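
The check suggested above can be taken one step further. The following is a minimal sketch, not part of racon or minimap2, that counts the records in a PAF file and how many of them name both a read present in the sequences file and a target present in the target file. The function names `fasta_names` and `paf_summary` are hypothetical helpers introduced here for illustration. Racon may also drop overlaps through its own filtering, which this sketch does not model.

```python
def fasta_names(path):
    """Return the set of sequence names (first word of each '>' header)."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def paf_summary(paf_path, reads_fasta, targets_fasta):
    """Return (total PAF records, records whose query and target names are both known)."""
    read_names = fasta_names(reads_fasta)
    target_names = fasta_names(targets_fasta)
    total = 0
    usable = 0
    with open(paf_path) as handle:
        for line in handle:
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 12:  # PAF has 12 mandatory columns
                continue
            total += 1
            # Column 1 is the query name, column 6 the target name.
            if cols[0] in read_names and cols[5] in target_names:
                usable += 1
    return total, usable


if __name__ == "__main__":
    # File names taken from the commands in this thread.
    total, usable = paf_summary(
        "cluster_1329_ovlp_mapping_test_fwd.paf",
        "cluster_1329.fasta",
        "TAIR10_chr_all.fasta",
    )
    print(f"{total} PAF records, {usable} with matching read and target names")
```

If the second number is zero while the first is not, a name mismatch between the PAF and the FASTA files would be one thing to look at.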

@sandaruwanrat
Author

Hello Robert,

  1. Neither the .paf nor the .sam file is empty.

  2. TAIR10_chr_all.fasta is the Arabidopsis genome file. It contains five chromosomes plus chrM. The length of each sequence is as follows:

chr1 30427671
chr2 19698289
chr3 23459830
chr4 18585056
chr5 26975502
chrM 367808

My aim is to collapse the sequences in each cluster file (e.g. cluster_9.fasta) into a single consensus sequence.
All sequences in a given "cluster_XXX.fasta" file should belong to the same genomic region. I would like to know whether I am doing something wrong.

Thank you.

Best Regards
Sandaruwan
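
Because the target file here is the whole genome, one possible reading of the multi-sequence output is that reads from a single cluster overlap more than one chromosome. The sketch below is an illustration under that assumption, not a confirmed diagnosis: it tallies, for each target named in a PAF file, how many records hit it and the span they cover. The helper name `target_spans` is hypothetical.

```python
def target_spans(paf_lines):
    """Map each target name to (record count, smallest start, largest end)."""
    spans = {}
    for line in paf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:  # skip anything that is not a full PAF record
            continue
        # Columns 6, 8 and 9 are target name, target start and target end.
        name, start, end = cols[5], int(cols[7]), int(cols[8])
        count, lo, hi = spans.get(name, (0, start, end))
        spans[name] = (count + 1, min(lo, start), max(hi, end))
    return spans


if __name__ == "__main__":
    # File name taken from the commands in this thread.
    with open("cluster_9_ovlp_mapping_test_fwd.paf") as handle:
        for name, (count, lo, hi) in sorted(target_spans(handle).items()):
            print(f"{name}\t{count} records\t{lo}-{hi}")
```

If more than one chromosome shows up, or the span on one chromosome is much wider than the expected cluster length, the cluster may not correspond to a single genomic region.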

@rvaser
Collaborator

rvaser commented Sep 9, 2022

How big are the genomic regions of each cluster?

@sandaruwanrat
Author

It varies. The mean lengths of some clusters are 250 bp, 950 bp and 1.1 kb; overall, the clusters range from 250 bp to 2.5 kb.

@rvaser
Collaborator

rvaser commented Sep 9, 2022

And how did you obtain the clusters? You might try https://github.com/rvaser/spoa instead of Racon.
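
A minimal sketch of trying spoa on one cluster file follows. It assumes a `spoa` executable is on `PATH` and that calling it with only the input FASTA and default options writes a consensus to standard output; please confirm this against `spoa --help` for the installed version. The wrapper names `spoa_command` and `run_spoa` are hypothetical.

```python
import shutil
import subprocess


def spoa_command(cluster_fasta):
    """Build the command line: spoa with default options on one cluster file."""
    return ["spoa", str(cluster_fasta)]


def run_spoa(cluster_fasta, out_fasta):
    """Run spoa on a cluster file and write its standard output to out_fasta."""
    if shutil.which("spoa") is None:
        raise FileNotFoundError("spoa executable not found on PATH")
    with open(out_fasta, "w") as out:
        subprocess.run(spoa_command(cluster_fasta), stdout=out, check=True)


if __name__ == "__main__":
    # Input name taken from this thread; the output name is only an example.
    run_spoa("cluster_9.fasta", "cluster_9_spoa_consensus.fasta")
```

Unlike the racon command in this thread, this takes only the reads of one cluster as input, so no reference genome or overlap file is involved.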

@sandaruwanrat
Author

I obtained the clusters based on UMIs, using https://github.com/fhlab/UMIC-seq.
However, https://github.com/SorenKarst/longread_umi/blob/master/scripts/consensus_racon.sh uses racon to get consensus sequences from clusters.

I will try spoa instead of racon.

Thank you very much.

Best regards
Sandaruwan
