Can't assume all CRAN packages do not import bioc packages #85

Closed
chainsawriot opened this issue Feb 24, 2023 · 10 comments

Comments

@chainsawriot
Collaborator

https://cran.r-project.org/web/packages/restfulr/index.html

@chainsawriot
Collaborator Author

.query_snapshot_dependencies_cran("restfulr", snapshot_date = "2023-01-01")
##   snapshot_date        x x_version           x_pubdate       x_pkgref
##1     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##2     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##3     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##4     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##5     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##6     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##7     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##8     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##9     2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##10    2023-01-01 restfulr    0.0.15 2022-06-16 10:44:34 cran::restfulr
##           y     type y_raw_version        y_pkgref
##1        XML  Imports             *       cran::XML
##2      RCurl  Imports             *     cran::RCurl
##3      rjson  Imports             *     cran::rjson
##4  S4Vectors  Imports    >= 0.13.15 cran::S4Vectors
##5       yaml  Imports             *      cran::yaml
##6          R  Depends      >= 3.4.0         cran::R
##7    methods  Depends             *   cran::methods
##8    getPass Suggests             *   cran::getPass
##9      rsolr Suggests             *     cran::rsolr
##10     RUnit Suggests             *     cran::RUnit
 

cran::S4Vectors is not a correct pkgref: S4Vectors is a Bioconductor package, so this requirement can never be fulfilled from CRAN. This is another cause of the infinite loop.

The question is how to solve this.

@chainsawriot
Collaborator Author

bioc_pkgs <- rownames(available.packages(repos = "https://bioconductor.org/packages/release/bioc", filters = list()))
unique(unlist(tools::package_dependencies(bioc_pkgs, reverse = TRUE)))

@schochastics
Member

schochastics commented Feb 24, 2023

Hmm, that doesn't seem to include all packages, though?

length(bioc_pkgs)
# 2165

This says 2183

Wouldn't the simplest fix be to just always query bioc?
Wait, we only query bioc at the top level (i.e. x), not for y; that's the issue, right?
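
A minimal sketch of what resolving the pkgref of each y could look like, assuming a character vector of Bioconductor package names is already at hand. The helper name `normalize_pkgref_sketch` and the hard-coded `bioc_pkgs` are hypothetical and only for illustration; this is not rang's actual `.normalize_pkgs`.

```r
# Hypothetical helper: prefix each dependency with "bioc::" when it appears in
# a known list of Bioconductor packages, and with "cran::" otherwise.
normalize_pkgref_sketch <- function(pkgs, bioc_pkgs) {
    # paste0() would turn empty input into "::", so return early instead.
    if (length(pkgs) == 0) {
        return(character(0))
    }
    prefix <- ifelse(pkgs %in% bioc_pkgs, "bioc", "cran")
    paste0(prefix, "::", pkgs)
}

# Assumption: in practice this list would come from a Bioconductor query.
bioc_pkgs <- c("S4Vectors", "BiocGenerics")
deps <- c("XML", "RCurl", "rjson", "S4Vectors", "yaml")
refs <- normalize_pkgref_sketch(deps, bioc_pkgs)
print(refs)
```

With a lookup like this applied to y as well as x, S4Vectors would no longer be labelled as a CRAN package.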

@schochastics
Member

.query_snapshot_dependencies_cran <- function(handle = "rtoot", snapshot_date = "2022-12-10") {
    snapshot_date <- anytime::anytime(snapshot_date, tz = "UTC", asUTC = TRUE)
    search_res <- .memo_search(handle)
    search_res$pubdate <- anytime::anytime(search_res$crandb_file_date, tz = "UTC", asUTC = TRUE)
    snapshot_versions <- search_res[search_res$pubdate <= snapshot_date,]
    if (nrow(snapshot_versions) == 0) {
        stop("No snapshot version exists for ", handle, ".",  call. = FALSE)
    }
    latest_version <- utils::tail(snapshot_versions[order(snapshot_versions$pubdate),], n = 1)
    dependencies <- latest_version$dependencies[[1]]
    .is_bioc(dependencies$package, bioc_version) # LOOK HERE: bioc_version is not yet available in this function
    # change the rest accordingly
    if (nrow(dependencies) != 0) {
        return(data.frame(snapshot_date = snapshot_date, x = handle, x_version = latest_version$Version, x_pubdate = latest_version$pubdate, x_pkgref = .normalize_pkgs(handle), y = dependencies$package, type = dependencies$type, y_raw_version = dependencies$version, y_pkgref = .normalize_pkgs(dependencies$package)))
    } else {
        ## no y
        return(data.frame(snapshot_date = snapshot_date, x = handle, x_version = latest_version$Version, x_pubdate = latest_version$pubdate, x_pkgref = .normalize_pkgs(handle)))
    }
}

But this would mean we either have to pass the bioc version to the function or query it again from snapshot_date, which the function already has. Since we use memoise, this shouldn't be an issue.
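
To illustrate why a repeated lookup is cheap once it is memoised, here is a small sketch that uses a base R closure as a stand-in for memoise. `memoise_sketch` and `fake_bioc_query` are hypothetical names, and the version string is only an example; the real code would wrap the actual Bioconductor query.

```r
# Stand-in for memoise::memoise(): cache results keyed by the single argument.
memoise_sketch <- function(f) {
    cache <- new.env(parent = emptyenv())
    function(key) {
        key <- as.character(key)
        if (!exists(key, envir = cache, inherits = FALSE)) {
            assign(key, f(key), envir = cache)
        }
        get(key, envir = cache, inherits = FALSE)
    }
}

# Hypothetical expensive query; the counter records how often it really runs.
n_calls <- 0
fake_bioc_query <- function(bioc_version) {
    n_calls <<- n_calls + 1
    c("S4Vectors", "BiocGenerics")
}

memo_query <- memoise_sketch(fake_bioc_query)
first <- memo_query("3.16")
second <- memo_query("3.16")  # served from the cache, no second query
```

Only the first call reaches the underlying function; later calls with the same version reuse the cached result.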

@chainsawriot
Collaborator Author

The most accurate list is probably

.memo_search_bioc("release")$Package

@chainsawriot
Collaborator Author

With commit 2667893, it can now be demonstrated.

x <- resolve("restfulr")
x$ranglet[[1]]$unresolved_deps

This can also be used as a test for the solution to this issue.

@chainsawriot
Collaborator Author

If .memo_search_bioc has memoisation, it doesn't sound too bad to me to always enable query_bioc and always call .normalize_pkgs with bioc_version. Every resolve, even one for CRAN packages only, would then make 2-4 requests to Bioconductor.

All .query_snapshot_dependencies_* functions need to take bioc_version.

@schochastics
Member

@chainsawriot do you want to implement this? Otherwise I can do it too.

@chainsawriot
Collaborator Author

@schochastics I want to implement this. I am still on this bug-hunting quest to beat bioc::Organism.dplyr. I want to beat this miniboss!

@schochastics
Member

Haha, go for it :D I'll go back to renv then.
