Step 4 breaks with underscore in chromosome names. #9

Open
aakashsur opened this issue Mar 6, 2019 · 8 comments

@aakashsur

aakashsur commented Mar 6, 2019

So I have underscores in my chromosome names and the chrList variable is set to "LtaP_01 LtaP_02 LtaP_03 LtaP_04 LtaP_05 LtaP_06 LtaP_07 LtaP_08 LtaP_09 LtaP_10 LtaP_11 LtaP_12 LtaP_13 LtaP_14 LtaP_15 LtaP_16 LtaP_17 LtaP_18 LtaP_19 LtaP_20 LtaP_21 LtaP_22 LtaP_23 LtaP_24 LtaP_25 LtaP_26 LtaP_27 LtaP_28 LtaP_29 LtaP_30 LtaP_31 LtaP_32 LtaP_33 LtaP_34 LtaP_35 LtaP_36 MaxiA"

This leads to the following error message when Step 4 hits the KR normalization:

Traceback (most recent call last):
  File "/home/ec2-user/mHiC/bin/KR_norm_mHiC.py", line 466, in <module>
    writeInteraction(norm_mtx, baseName, args.outdir, revFragsDic, args.chrNum, args.resolution)
  File "/home/ec2-user/mHiC/bin/KR_norm_mHiC.py", line 357, in writeInteraction
    chr1, mid1 = revFragsDic[row[i]].split("_")
ValueError: too many values to unpack

This looks to be caused by the script splitting the fragment name on underscores, so a chromosome name that itself contains an underscore yields more than two fields. Other than not using underscores in chromosome names, any ideas for a fix? I'll keep looking and see if I come up with something.
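
A minimal sketch of the failure, assuming the fragment key has the form chromosome name, underscore, midpoint (which the `split("_")` call in the traceback suggests). The midpoint value 12500 is made up for illustration. Splitting once from the right is one possible workaround, not necessarily what the maintainers would choose:

```python
# Hypothetical fragment key in "<chrom>_<mid>" form; 12500 is an invented midpoint.
key = "_".join(["LtaP_01", str(12500)])  # "LtaP_01_12500"

try:
    # Three fields are produced, but only two names are available to unpack into.
    chrom, mid = key.split("_")
except ValueError as err:
    print("split failed:", err)

# Splitting once from the right keeps any underscores inside the chromosome
# name, provided the midpoint itself never contains an underscore.
chrom, mid = key.rsplit("_", 1)
print(chrom, mid)
```

This only works because the midpoint is numeric; it does not help if both halves of the key could contain the separator.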

@aakashsur
Author

I believe I've isolated the issue to these four lines:

revFragsDic.append("_".join([str(chrom), str(mid)]))

mHiC/bin/KR_norm_mHiC.py

Lines 357 to 358 in 4a75988

chr1, mid1 = revFragsDic[row[i]].split("_")
chr2, mid2 = revFragsDic[col[i]].split("_")

chr, mid = revFragsDicAll[i].split("_")

By changing the join character and the corresponding split characters, I'm able to resolve the issue. However, it's probably better to keep the chromosome and midpoint in a data structure rather than encoding them in a string value. I haven't studied the code enough yet to figure out how it uses these names.
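
As a rough illustration of the data-structure idea, the sketch below stores each fragment as a named tuple instead of a joined string, so no separator is involved at all. `Fragment` and `build_fragment_index` are hypothetical names and the midpoints are invented; this is not how KR_norm_mHiC.py is actually organized:

```python
from typing import Iterable, List, NamedTuple, Tuple


class Fragment(NamedTuple):
    """One fragment, with chromosome and midpoint kept as separate fields."""
    chrom: str
    mid: int


def build_fragment_index(fragments: Iterable[Tuple[str, int]]) -> List[Fragment]:
    # Position in the returned list plays the role of the matrix row/column index.
    return [Fragment(str(chrom), int(mid)) for chrom, mid in fragments]


rev_frags = build_fragment_index([("LtaP_01", 12500), ("MaxiA", 2500)])

# Unpacking works regardless of which characters the chromosome name contains.
chr1, mid1 = rev_frags[0]
print(chr1, mid1)
```

The trade-off is that every place that currently builds or splits the string key would need to change, which is a larger edit than swapping the separator.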

@yezhengSTAT
Owner

Yes, we assume that there is no "_" in the chromosome name. I will add it to the manual. Thanks for pointing it out.

@bgbrink

bgbrink commented Sep 11, 2019

Is there any way to fix the pipeline or would the better option be to modify the input files?

@yezhengSTAT
Owner

Hello,
Yes, you can replace "_" in the above-cited scripts with another uncommon symbol such as "-" or "=". Alternatively, you can rename the chromosomes (also in the corresponding reference genome files), replacing the "_" in their names with another symbol.

Thanks,
Ye
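
Whichever separator is chosen, it is only safe if it appears in none of the chromosome names. A small sketch of such a check follows; `names_containing` is a hypothetical helper, not part of mHiC, and the shortened list stands in for the full chrList from the original report:

```python
def names_containing(chrom_names, delimiter):
    """Return the chromosome names that contain the candidate delimiter."""
    return [name for name in chrom_names if delimiter in name]


# Shortened stand-in for the chrList reported at the top of this issue.
chr_list = "LtaP_01 LtaP_02 MaxiA".split()

clashes_underscore = names_containing(chr_list, "_")
clashes_equals = names_containing(chr_list, "=")

print(clashes_underscore)  # names that would break a "_" separator
print(clashes_equals)      # an empty list means "=" is safe for these names
```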

@DAljogol

DAljogol commented Sep 16, 2019

Hi Ye,

Could you also please help me resolve a similar issue? I'm trying to run mHiC but I get the following error. (Please note that chrList=($(seq 1 22) X Y M).)

Traceback (most recent call last):
  File "/home/software/mHiC-master/bin/KR_norm_mHiC.py", line 469, in <module>
    writeBias(bias, baseName, args.outdir, revFragsDicAll, args.chrNum, args.resolution)
  File "/home/software/mHiC-master/bin/KR_norm_mHiC.py", line 374, in writeBias
    chr, mid = revFragsDicAll[i].split("_")
ValueError: too many values to unpack (expected 2)

Best,
Dina

@yezhengSTAT
Owner

Hello Dina,
Thanks for using mHiC! Do you also have "_" in your chromosome name? How did you set the variable "chrList=**"?

(I am now thinking of replacing "_" with a different joining symbol in my code. I will let you know once I finish it.)

Thanks,
Ye

@yezhengSTAT
Owner

Hello,
I have changed the joining symbol "_" in KR_norm_mHiC.py to "=" and tested it on the demo data. Let me know if you run into any further problems with that.

Thanks again for pointing it out! I really appreciate it!

Thanks,
Ye

@DAljogol

Thanks Ye for your help!! It works fine now :)

Best,
Dina
