Issue with toppct ratio/aggregate and bottompct ratio/aggregate being identical #2

Open
acortmann opened this issue Sep 10, 2024 · 1 comment

Comments

@acortmann

I've been applying tempted to my longitudinal dataset and am generally finding it very useful.

When I run tempted_all() with absolute=TRUE or absolute=FALSE, the OTUs listed for the toppct ratio and the bottompct ratio appear to be identical, regardless of how absolute is set and regardless of the percentage chosen. The same happens with the metafeature aggregate lists if the values are set to less than 1.

Based on reading the documentation, I thought this list would differ between the top and bottom. Looking at the included OTUs compared to the PC loadings, it looks like the included OTUs are only those with positive loadings.

Since the ratios in the metafeature_ratio and the metafeature_aggregate appear to make sense, it isn't clear if those are calculated using the correct OTUs or not.

Is the bottompct ratio/aggregate list pulling the wrong data from the analysis?

Thanks for your help.

@pixushi
Owner

pixushi commented Sep 26, 2024

Thanks for your interest in our method!

In the current version, aggregate_feature() does not take the option of absolute=TRUE/FALSE, so metafeature_aggregate and toppct_aggregate are not affected by absolute=TRUE/FALSE. We will change it in the next update to include this option.

For ratio_feature(), the results should differ between absolute=TRUE and absolute=FALSE. I ran tempted_all() on the example data with absolute=TRUE and with absolute=FALSE, and it returned different results for metafeature_ratio, toppct_ratio, and bottompct_ratio. One possible reason that absolute=TRUE/FALSE makes no difference in your data is that the feature loadings are nearly symmetric around zero, so ranking them by absolute value or by signed value leads to the same features being picked from the top and bottom of the list. Does this answer your question?
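
The symmetry explanation above can be illustrated with a small, generic sketch. This is not the package's ratio_feature() code and makes no claim about its internals. The helper names (top_k_signed, top_k_absolute) and the toy loadings are hypothetical, and Python is used only as neutral pseudocode. It shows that when loadings are symmetric around zero, the features at the two signed extremes are the same features that have the largest magnitudes, whereas skewed loadings give different sets.

```python
# Hypothetical illustration only: generic ranking, not the tempted implementation.

def top_k_signed(loadings, k):
    """Return (top, bottom): names of the k most positive and k most negative loadings."""
    ranked = sorted(loadings, key=loadings.get)  # ascending by signed value
    return set(ranked[-k:]), set(ranked[:k])

def top_k_absolute(loadings, k):
    """Return names of the k loadings with the largest magnitude, ignoring sign."""
    ranked = sorted(loadings, key=lambda name: abs(loadings[name]), reverse=True)
    return set(ranked[:k])

# Toy loadings for one component (made-up values).
symmetric = {"otu1": -0.9, "otu2": -0.7, "otu3": -0.1,
             "otu4": 0.1, "otu5": 0.7, "otu6": 0.9}
skewed = {"otu1": -0.2, "otu2": -0.1, "otu3": 0.1,
          "otu4": 0.5, "otu5": 0.7, "otu6": 0.9}

for label, loadings in (("symmetric", symmetric), ("skewed", skewed)):
    top, bottom = top_k_signed(loadings, 2)
    by_magnitude = top_k_absolute(loadings, 4)
    # True when signed ranking and magnitude ranking pick the same features.
    print(label, (top | bottom) == by_magnitude)
```

With these toy values the script prints `symmetric True` and `skewed False`. A comparable check on the real loadings (how balanced they are around zero) would indicate whether symmetry explains why the absolute setting has no visible effect.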
