Matched selection for pair of RIs for subTAD and TAD. #7

Open
X-xiaoyeren opened this issue Feb 8, 2022 · 1 comment

Comments


X-xiaoyeren commented Feb 8, 2022

Hi,
I have read your great manuscript in Genome Research.
I'm very curious about hierarchical TADs, and I successfully obtained the domains at every RI. Next, I picked one RI for subTADs and one for TADs, assigned Level 1 to the TADs and Level 2 to the subTADs, and finally merged them.

Then I ran into a big problem: the subTADs are not always fully contained within TADs, probably because of the wide range of RI values involved, such as RI > 55% for subTADs and RI > 69% for TADs. One possible solution may be to find the best-matched pair of RIs for subTADs and TADs.

Since it takes me a very long time to find a matched pair of RIs for subTADs and TADs, this would be infeasible for a large number of samples. If this step cannot be solved, it would be a great pity for such a comprehensible and user-friendly method!

Thus, could you please give me some suggestions or help me find the solutions?

Any reply will be helpful. Thanks a lot!

@X-xiaoyeren X-xiaoyeren changed the title Matched selectiong for pair of RIs for subTAD and TAD. Matched selection for pair of RIs for subTAD and TAD. Feb 8, 2022
zhanyinx (Owner) commented Feb 9, 2022

Hi,

Thanks for your appreciation of the method.
Since domain boundaries are not always well defined, CaTCH allows small adjustments of boundary positions at each iteration. This is why "subTADs" (or, in general, domains at lower RI) are not always contained within "TADs" (or domains at higher RI).

Regarding your trouble, can you please give me some clarifications?

  1. Why do you merge Level 1 and Level 2 (TADs and subTADs)?
  2. Did you use the Hi-C dataset from the manuscript, or do you have your own? If you use your own, it is probably better to identify TADs and subTADs using the criteria described in the manuscript: ~180kb size for subTADs, and optimal functional properties for TADs (enrichment of CTCF, for instance). If you cannot use the "optimal functional properties" criterion, you could find TADs based on size (800kb-1Mb for mouse).
  3. If I understood correctly, you want to find which subTADs lie within which TAD, right? If this is the goal, you can create GRanges objects from your lists of TADs and subTADs and use the findOverlaps function to find overlaps. This function will return multiple hits if a subTAD is not fully within a single TAD. In that case, you can use the pintersect function to find the hit with the maximum overlap.
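
As a rough illustration of points 2 and 3, here is a minimal plain-Python sketch. It is not CaTCH code and does not use the Bioconductor API. It assumes domains on a single chromosome are given as half-open `(start, end)` tuples in bp. The function names `pick_ri_by_size` and `assign_subtads` are hypothetical, and using the median domain size as the "size" criterion is an assumption, not something taken from the manuscript.

```python
from statistics import median


def pick_ri_by_size(domains_by_ri, target_size):
    """Return the RI whose median domain size (bp) is closest to target_size.

    domains_by_ri maps an RI value to a list of (start, end) tuples.
    Using the median is an assumption made for this sketch.
    """
    def median_size(domains):
        return median(end - start for start, end in domains)

    return min(
        domains_by_ri,
        key=lambda ri: abs(median_size(domains_by_ri[ri]) - target_size),
    )


def assign_subtads(tads, subtads):
    """Map each subTAD to the TAD it overlaps most (None if it overlaps none).

    This mirrors the idea of listing all overlaps and keeping the largest
    intersection, so a subTAD that straddles a TAD boundary is still
    assigned to exactly one TAD.
    """
    assignment = {}
    for sub in subtads:
        best, best_overlap = None, 0
        for tad in tads:
            # Width of the intersection; zero or negative means no overlap.
            overlap = min(sub[1], tad[1]) - max(sub[0], tad[0])
            if overlap > best_overlap:
                best, best_overlap = tad, overlap
        assignment[sub] = best
    return assignment


# Toy data, invented for illustration only.
tads = [(0, 1_000_000), (1_000_000, 2_000_000)]
subtads = [(0, 200_000), (900_000, 1_060_000), (1_500_000, 1_700_000)]
parents = assign_subtads(tads, subtads)
```

In this toy example the second subTAD crosses the TAD boundary at 1 Mb and is assigned to the first TAD, because 100 kb of it lies there versus 60 kb in the second. For many samples, `pick_ri_by_size` could be run once with a subTAD-scale target and once with a TAD-scale target to avoid choosing the RI pair by hand.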

lemme know if this helps
Best
Yinxiu Zhan
