Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimateAlpha #460

Merged
merged 62 commits into from
Jul 5, 2024
Merged

estimateAlpha #460

merged 62 commits into from
Jul 5, 2024

Conversation

ChouaibB
Copy link
Contributor

Initial draft implementation.

@ChouaibB ChouaibB linked an issue Oct 13, 2023 that may be closed by this pull request
@ChouaibB ChouaibB requested a review from antagomir October 13, 2023 14:51
@antagomir
Copy link
Member

Related to #417

R/estimateAlphaWithRarefaction.R Outdated Show resolved Hide resolved
R/estimateAlphaWithRarefaction.R Outdated Show resolved Hide resolved
R/estimateAlphaWithRarefaction.R Outdated Show resolved Hide resolved
tests/testthat/test-10estimateAlphaWithRarefaction.R Outdated Show resolved Hide resolved
@antagomir
Copy link
Member

I am tending to think that we should combine all these estimateXXX functions into a single estimateAlpha function as your PR suggest. The other estimateXXX functions could be turned into internal functions to simplify the interface. That estimateAlpha function can then have rarefaction as optional argument (including the user-specified rarefaction depth; by default the lowest read count among the samples).

Things to consider:

  1. previously exported estimateXXX functions would need deprecation message when they are moved into internal functions
  2. the diversity index names should have suffixes, like "shannon_diversity", "observed_richness" etc.
  3. there is a mechanism available in R that can pick those argument names even when just the (unique) start of the index name is given, like "shannon" would be recognized to mean "shannon_diversity" so the users can use shorter names if they need to but the output need to use the full names (for instance Simpson index is available both for diversity and evenness, therefore it must be indicated which one it is).

If unclear, we can discuss more.

I have some further comments in mind after the next PR draft on this.

@antagomir
Copy link
Member

The function has to be changed into estimateAlpha, which then includes rarefaction as an optional argument

@antagomir
Copy link
Member

An idea: the method name (like "shannon") could also say "diversity", then the function would return all "diversity" indices and if one asks for "dominance" then those etc (richness / evenness / diversity / dominance / rarity are the options and the indices are listed in those corresponding functions now). If no method is specific, then all indices would be returned by default.

@TuomasBorman
Copy link
Contributor

I think this is not the best way to implement this. Better way is to keep those estimateDiversity.R and other files and just add estimateAlpha.R with estimateAlpha functionality (remove documentation from necessary places)

The problem with moving all the code to one file is that it is a nightmare for someone that tries to check what has been changed. It looks like everything has

@antagomir
Copy link
Member

antagomir commented Oct 18, 2023

I agree. The old and extensively tested functions can be kept as is. They should be just converted into internal functions that are not exported (as I already explictily suggested in the opening statement of this issue), and then called from a wrapper called estimateAlpha(). This should be also faster to implement than rewriting the entire function.

@ChouaibB
Copy link
Contributor Author

Alright, I misunderstood the comment I will make the changes.

@antagomir
Copy link
Member

Always aim at minimal and straightfwd changes unless otherwise agreed.

If you could update this PR accordingly that would be great!

@ChouaibB
Copy link
Contributor Author

An idea: the method name (like "shannon") could also say "diversity", then the function would return all "diversity" indices and if one asks for "dominance" then those etc (richness / evenness / diversity / dominance / rarity are the options and the indices are listed in those corresponding functions now). If no method is specific, then all indices would be returned by default.

I didn't understand, could you please explain more.

@ChouaibB
Copy link
Contributor Author

ChouaibB commented Oct 19, 2023

An idea: the method name (like "shannon") could also say "diversity", then the function would return all "diversity" indices and if one asks for "dominance" then those etc (richness / evenness / diversity / dominance / rarity are the options and the indices are listed in those corresponding functions now). If no method is specific, then all indices would be returned by default.

I didn't understand, could you please explain more.

Do you mean smt like this?

.get_indices <- function(measure) {
    switch(measure,
           "diversity" = c("coverage_diversity", "coverage",
                           "faith_diversity", "faith",
                           "fisher_diversity", "fisher",
                           "gini_simpson_diversity", "gini_simpson",
                           "inverse_simpson_diversity", "inverse_simpson",
                           "log_modulo_skewness_diversity", "log_modulo_skewness",
                           "shannon_diversity", "shannon"),
           "dominance" = c("absolute_dominance", "absolute",
                           "dbp_dominance", "dbp",
                           "core_abundance_dominance", "core_abundance",
                           "gini_dominance", "gini", 
                           "dmn_dominance", "dmn",
                           "relative_dominance", "relative",
                           "simpson_lambda_dominance", "simpson_lambda"),
           "evenness" = c("camargo_evenness", "camargo",
                          "pielou_evenness", "pielou",
                          "simpson_evenness",
                          "evar_evenness", "evar",
                          "bulla_evenness", "bulla"),
           "richness" = c("ace_richness", "ace",
                          "chao1_richness", "chao1",
                          "hill_richness", "hill",
                          "observed_richness", "observed"))
}

Copy link
Member

@antagomir antagomir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Some comments added. Perhaps TB has more

R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
tests/testthat/test-10estimateAlpha.R Outdated Show resolved Hide resolved
tests/testthat/test-10estimateAlpha.R Outdated Show resolved Hide resolved
tests/testthat/test-10estimateAlpha.R Outdated Show resolved Hide resolved
tests/testthat/test-10estimateAlpha.R Outdated Show resolved Hide resolved
Copy link
Contributor

@TuomasBorman TuomasBorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COding style should more or less match between functions even though we have multiple developers. For example there is no many comments

R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
R/estimateAlpha.R Outdated Show resolved Hide resolved
@TuomasBorman TuomasBorman changed the title Regarding the issue 417 estimateAlpha Oct 20, 2023
@TuomasBorman
Copy link
Contributor

I think this is good to go. @ake123 Do you have time to check that there are necessary unit tests (there might be already)? Check Leo's comments about testing whether rarefying was disabled (n.iter = NULL and n.iter = 0)

@ake123
Copy link
Collaborator

ake123 commented Jul 4, 2024

Sure I can check that

I think this is good to go. @ake123 Do you have time to check that there are necessary unit tests (there might be already)? Check Leo's comments about testing whether rarefying was disabled (n.iter = NULL and n.iter = 0)

R/addAlpha.R Outdated Show resolved Hide resolved
@TuomasBorman
Copy link
Contributor

Fixed some issues and updated documentation. Would be nice to have someone else's thoughts (if everything works) but I think this is good now

@Daenarys8 Daenarys8 mentioned this pull request Jul 5, 2024
@TuomasBorman
Copy link
Contributor

One last thing: I changed n.iter to niter to harmonize the paramater names. nsomething is used to denote number of something. We have already nclasses and ndimred.

If someone is against this change, I'm happy to discuss about this

@TuomasBorman TuomasBorman requested a review from antagomir July 5, 2024 07:41
Copy link
Member

@antagomir antagomir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@ake123
Copy link
Collaborator

ake123 commented Jul 5, 2024

It seems the unit test have been updated with niter=NULL, niter=0 as default and with no errors.

@TuomasBorman
Copy link
Contributor

@ake123 Can you check that OMA and mia* packages are updated?

@TuomasBorman TuomasBorman merged commit a340211 into devel Jul 5, 2024
3 checks passed
@TuomasBorman TuomasBorman deleted the issue_417 branch July 5, 2024 12:19
@ake123
Copy link
Collaborator

ake123 commented Jul 5, 2024

Yes sure! Should I add section addAlpha in the diversity section?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rarefaction as optional strategy for alpha diversity
6 participants