add support for Bioconductor #58 #59
Codecov Report
@@ Coverage Diff @@
## v0.2 #59 +/- ##
==========================================
- Coverage 93.64% 93.44% -0.21%
==========================================
Files 5 5
Lines 535 625 +90
==========================================
+ Hits 501 584 +83
- Misses 34 41 +7
@@ -58,6 +58,11 @@
grepl("/", pkg)
}

.is_bioc <- function(pkg,bioc_version){
@chainsawriot This function requires the Bioc version to check whether the package is in a specific release. Couldn't find any other way to check this so far, but this would also require adding bioc_version as a parameter to .normalize_pkg
Suggestion for .normalize_pkg(): make bioc_version optional. If it is omitted, the package is assumed to be a CRAN package. If present, check with .is_bioc():
.normalize_pkg <- function(pkg,bioc_version = NULL) {
if (pkg == "" || is.na(pkg)) {
stop("Invalid `pkg`.", call. = FALSE)
}
if (isTRUE(.is_github(pkg))) {
if (isTRUE(grepl("github\\.com", pkg))) {
pkg <- .extract_github_handle(pkg)
}
}
if (isTRUE(.is_pkgref(pkg))) {
return(.clean_suffixes(pkg))
}
if (isTRUE(.is_github(pkg))) {
return(paste0("github::", .clean_suffixes(pkg)))
}
if(is.null(bioc_version)){
return(paste0("cran::", .clean_suffixes(pkg)))
} else{
if(isTRUE(.is_bioc(pkg,bioc_version))){
return(paste0("bioc::", .clean_suffixes(pkg)))
} else{
return(paste0("cran::", .clean_suffixes(pkg)))
}
}
}
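To make the CRAN/Bioc fallback above easier to try in isolation, here is a minimal sketch. It assumes the release check boils down to a name lookup in the package index of one Bioconductor release; `is_bioc_sketch()`, `normalize_sketch()` and the two-row `bioc_index` are hypothetical stand-ins, not rang's actual helpers, and the GitHub and pkgref branches are left out.

```r
# Hypothetical stand-in for the package index of one Bioconductor release.
bioc_index <- data.frame(
  Package = c("BiocGenerics", "Biobase"),
  stringsAsFactors = FALSE
)

# Assumption: "is in this release" means "name appears in the release index".
is_bioc_sketch <- function(pkg, bioc_pkgs) {
  pkg %in% bioc_pkgs
}

# Mirrors only the last branch of the suggested .normalize_pkg():
# no Bioc information means CRAN, otherwise look the package up first.
normalize_sketch <- function(pkg, bioc_pkgs = NULL) {
  if (is.null(bioc_pkgs)) {
    return(paste0("cran::", pkg))
  }
  if (isTRUE(is_bioc_sketch(pkg, bioc_pkgs))) {
    return(paste0("bioc::", pkg))
  }
  paste0("cran::", pkg)
}

normalize_sketch("BiocGenerics", bioc_index$Package)
normalize_sketch("dplyr", bioc_index$Package)
normalize_sketch("BiocGenerics")
```

Without a Bioc index the last call falls back to `cran::`, which is the behaviour the optional argument is meant to preserve.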
R/resolve.R
Outdated
@@ -12,7 +12,7 @@
if (snapshot_date < attr(cached_biocver, "newest_date")) {
allvers <- cached_biocver
} else {
# allvers <- .memo_rver() #TODO
return(data.frame(version="3.16",date = "2022-11-02",rver = 4.2)) # TODO realtime check
this obviously needs to be changed eventually
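A rough sketch of what could replace the hard-coded row: pick the newest release whose date is on or before the snapshot date. The table here is illustrative; only the 3.16 row comes from the diff above, the other rows and the name `query_biocver_sketch()` are assumptions, and a real version would presumably fetch the table rather than inline it.

```r
# Illustrative release table. Only the 3.16 row is taken from this thread;
# the other rows are assumptions added for the example.
bioc_releases <- data.frame(
  version = c("3.14", "3.15", "3.16"),
  date = as.Date(c("2021-10-27", "2022-04-27", "2022-11-02")),
  rver = c("4.1", "4.2", "4.2"),
  stringsAsFactors = FALSE
)

# Return the row of the newest release published on or before snapshot_date.
query_biocver_sketch <- function(snapshot_date, releases = bioc_releases) {
  snapshot_date <- as.Date(snapshot_date)
  released <- releases[releases$date <= snapshot_date, , drop = FALSE]
  if (nrow(released) == 0) {
    stop("No Bioconductor release on or before ", format(snapshot_date), call. = FALSE)
  }
  released[which.max(as.numeric(released$date)), , drop = FALSE]
}

query_biocver_sketch("2022-12-01")$version
query_biocver_sketch("2022-06-01")$version
```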
R/resolve.R
Outdated
@@ -466,6 +471,13 @@ query_sysreqs <- function(rang, os = "ubuntu-20.04") {
unique(unlist(res))
}

.query_sysreqs_bioc <- function(handle, os) {
# TODO
Checked all System Requirements on the current release of Bioconductor and none of them have apt-get style. Very verbose content in general and not something we can ever resolve. Current release has like 200 pkgs with Sysreqs and most of them say "C++"
If that's vanilla C++ (probably just Rcpp or the C interface), it doesn't require any apt-get. These are all of them.
These are in the current release of bioc. So what I basically have to do is clean this output, match it with the ones in the link you shared, and add an apt install -y in front if found?
pkgs <- rang:::.memo_search_bioc("release")
tst <- pkgs$SystemRequirements
tst <- tst[!is.na(tst)]
unique(tst)
#> [1] "pandoc (>= 1.19.2.1)"
#> [2] "C++11"
#> [3] "gsl"
#> [4] "python, pytorch, numpy"
#> [5] "GNU make"
#> [6] "bcl2Fastq (versions >= 2.1.7)"
#> [7] "pandoc (http://pandoc.org/installing.html) for\ngenerating reports from markdown files."
#> [8] "JAGS (4.3.0)"
#> [9] "kallisto"
#> [10] "docker"
#> [11] "mailsend-go"
#> [12] "python (>= 2.7), sklearn, numpy, pandas, h5py"
#> [13] "Tktable, BWidget"
#> [14] "Cytoscape (>= 3.6.1) (if used for visualization of\nresults, heavily suggested)"
#> [15] "Graphviz version >= 2.2"
#> [16] "libcurl4-openssl-dev, libxml2-dev, libssl-dev,\ngfortran, build-essential, libz-dev, zlib1g-dev"
#> [17] "OpenBabel (>= 3.0.0) with headers\n(http://openbabel.org). Eigen3 with headers."
#> [18] "bowtie, samtools, and egrep are required for some\nfunctionalities"
#> [19] "Rtools (>= 3.1)"
#> [20] "Java version >= 1.7, Pandoc"
#> [21] "BUSCO (>= 5.1.3) <https://busco.ezlab.org/>"
#> [22] "Perl"
#> [23] "libxml2: libxml2-dev (deb), libxml2-devel (rpm)\nlibcurl: libcurl4-openssl-dev (deb), libcurl-devel (rpm)\nopenssl: libssl-dev (deb), openssl-devel (rpm), libssl_dev\n(csw), openssl@1.1 (brew)"
#> [24] "C++11, GNU make"
#> [25] "GNU make, C++11"
#> [26] "xml2, GNU make, C++11"
#> [27] "Java (>= 1.8)"
#> [28] "C++14"
#> [29] "To generate html reports pandoc\n(http://pandoc.org/installing.html) is required."
#> [30] "GNU make, meme, fimo"
#> [31] "Ensembl VEP (API version 105) and the Perl modules\nDBI and DBD::mysql must be installed. See the package README\nand Ensembl installation instructions:\nhttp://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#installer"
#> [32] "C++17, GNU make"
#> [33] "Java (>= 8)"
#> [34] "ImageMagick"
#> [35] "gsl. Note: users should have GSL installed. Windows\nusers: 'consult the README file available in the inst directory\nof the source distribution for necessary configuration\ninstructions'."
#> [36] "1. C++11, 2. a graphic driver or a CPU SDK. 3. ICD\nloader For Windows user, an ICD loader is required at\nC:/windows/system32/OpenCL.dll (Usually it is installed by the\ngraphic driver). For Linux user (Except mac):\nocl-icd-opencl-dev package is required. For Mac user, no action\nis needed for the system has installed the dependency. 4. GNU\nmake"
#> [37] "JRE 8+"
#> [38] "gtkmm-2.4, GNU make"
#> [39] "JAGS 4.0.0"
#> [40] "BLAT, UCSC hg18 in 2bit format for BLAT"
#> [41] "4GB of RAM"
#> [42] "graphviz"
#> [43] "GSL and OpenMP"
#> [44] "Python (>= 3.5.0), hic-straw"
#> [45] "JAGS 4.x.y"
#> [46] "Cytoscape (>= 3.3.0), Java (>= 8)"
#> [47] "clustalo, gs, perl"
#> [48] "megadepth\n(<https://github.com/ChristopherWilks/megadepth>)"
#> [49] "Meme Suite (v5.3.3 or above)\n<http://meme-suite.org/doc/download.html>"
#> [50] "Cytoscape (>= 3.9.0) for the cytoPath() examples"
#> [51] "HMMER3"
#> [52] "GNU make, PhISCS (optional)"
#> [53] "Ensembl VEP, Samtools"
#> [54] "glpk (>= 4.57)"
#> [55] "Python (>=3), numpy, pandas, h5py, scipy, argparse,\nsklearn, mofapy2"
#> [56] "mono-runtime 4.x or higher (including System.Data\nlibrary) on Linux/macOS, .Net Framework (>= 4.5.1) on Microsoft\nWindows."
#> [57] "python"
#> [58] "bedtools (>= 2.28.0), Stereogene (>= v2.22), CapR\n(>= 1.1.1)"
#> [59] "GNU make, Bash, Perl, Gzip"
#> [60] "libxml2, libSBML (>= 5.5)"
#> [61] "MAFFT (>= 7.305), OligoArrayAux (>= 3.8), ViennaRNA\n(>= 2.4.1), MELTING (>= 5.1.1), Pandoc (>= 1.12.3)"
#> [62] "C++ software package Random Jungle"
#> [63] "Java (>= 1.6)"
#> [64] "Rcpp"
#> [65] "C++11, Rtools (>= 3.1)"
#> [66] "C++11, GNU make, samtools"
#> [67] "pandoc (>= 1.12.3)"
#> [68] "nodejs"
#> [69] "Cytoscape (>= 3.7.1), CyREST (>= 3.8.0)"
#> [70] "None"
#> [71] "Java Runtime Environment (Java>= 11)"
#> [72] "hiredis"
#> [73] "Java version >= 1.6"
#> [74] "pyGenomeTracks (prefered to use\ninstall_pyGenomeTracks())"
#> [75] "jQuery, jQueryUI, qTip2, D3js and Raphael are\nrequired Javascript libraries made available via the online\nCDNJS service (http://cdnjs.cloudflare.com)."
#> [76] "optionally Graphviz (>= 2.16)"
#> [77] "libbz2 & liblzma & libcurl (with header files), GNU\nmake"
#> [78] "Internal files Xba.CQV, Xba.regions (or other\nregions file)"
#> [79] "OpenBabel"
#> [80] "Java"
#> [81] ".NET 5.0"
#> [82] "RNASeqR only support Linux and macOS. Window is not\nsupported. Python2 is highly recommended. If your machine is\nPython3, make sure '2to3' command is available."
#> [83] "libsbml (==5.10.2)"
#> [84] "python (< 3.7), tensorflow"
#> [85] "Python (>= 3.6), leidenalg (>= 0.8.2)"
#> [86] "c++11"
#> [87] "Java (>= 1.5)"
#> [88] "Python (>= 3.5), scikit-learn (>= 0.21.2),\npackaging"
#> [89] "samtools"
#> [90] "C++17"
#> [91] "systemPipeR can be used to run external\ncommand-line software (e.g. short read aligners), but the\ncorresponding tool needs to be installed on a system."
#> [92] "Primer3 (>= 2.5.0), BLAST+ (>=2.6.0)"
#> [93] "Java Runtime Environment (>= 6)"
#> [94] "Cytoscape >= 3.9.1"
#> [95] "C++11 Windows: Dokan Linux&Mac: fuse, pkg-config"
#> [96] "git"
#> [97] "Unix, Perl (>= 5.6.0), Netpbm"
Created on 2023-02-19 with reprex v2.0.2
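The clean-and-match idea could look roughly like this. It is a sketch under assumptions: `apt_lookup` is a hypothetical hand-written table (not the list from the shared link and not rang's data), `sysreqs_to_apt_sketch()` is an invented name, and tokens without a match (such as "C++11" or "GNU make") are simply ignored.

```r
# Hypothetical lookup from a lower-cased SystemRequirements token to a
# Debian/Ubuntu package name. These entries are assumptions for illustration.
apt_lookup <- c(
  gsl = "libgsl-dev",
  pandoc = "pandoc",
  graphviz = "graphviz",
  jags = "jags",
  samtools = "samtools"
)

sysreqs_to_apt_sketch <- function(sysreqs, lookup = apt_lookup) {
  sysreqs <- sysreqs[!is.na(sysreqs)]
  # Drop parenthesised parts (version constraints, URLs), then lower-case.
  cleaned <- gsub("\\([^)]*\\)", " ", tolower(sysreqs))
  # Split on commas, semicolons, dots and any whitespace (including newlines).
  tokens <- unlist(strsplit(cleaned, "[,;.[:space:]]+"))
  hits <- unique(unname(lookup[names(lookup) %in% tokens]))
  if (length(hits) == 0) {
    return(character(0))
  }
  paste("apt-get install -y", sort(hits))
}

sysreqs_to_apt_sketch(c("gsl", "JAGS (4.3.0)", "C++11, GNU make",
                        "Graphviz version >= 2.2", NA))
```

Free-text entries like "4GB of RAM" or the OpenCL paragraph yield no tokens in the table and so produce no command, which matches the point that most of these strings cannot be resolved automatically.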
@chainsawriot should be ready now
There is no caching yet. But I am checking whether caching is needed for Bioc packages in our Woody-based Docker image by manually modifying
x <- resolve("bioc::BiocGenerics", snapshot_date = "2013-08-28")
dockerize(x, "irgendwo") ## ask you to cache
dockerize(x, "irgendwo", cache = TRUE) ## no cache
Update:
Ah I see, I overlooked the caching part. Working on it
[see below]
I agree. But for explicit cases like:
Yep, had the same thought. I put that in the next commit
I think we must enforce
wget http://bioconductor.org/packages/2.12/bioc/src/contrib/BiocGenerics_0.6.0.tar.gz
--2023-02-20 19:25:23-- http://bioconductor.org/packages/2.12/bioc/src/contrib/BiocGenerics_0.6.0.tar.gz
Resolving bioconductor.org (bioconductor.org)... 18.66.147.62, 18.66.147.42, 18.66.147.31, ...
Connecting to bioconductor.org (bioconductor.org)|18.66.147.62|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/2.12/bioc/src/contrib/BiocGenerics_0.6.0.tar.gz [following]
--2023-02-20 19:25:23-- https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/2.12/bioc/src/contrib/BiocGenerics_0.6.0.tar.gz
Resolving mghp.osn.xsede.org (mghp.osn.xsede.org)... 192.69.103.248, 192.69.103.246, 192.69.103.247
Connecting to mghp.osn.xsede.org (mghp.osn.xsede.org)|192.69.103.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26482 (26K) [application/octet-stream]
Saving to: ‘BiocGenerics_0.6.0.tar.gz’
BiocGenerics_0.6.0.tar.gz 100%[===============================================>] 25.86K --.-KB/s in 0.1s
2023-02-20 19:25:25 (222 KB/s) - ‘BiocGenerics_0.6.0.tar.gz’ saved [26482/26482]
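For reference, the URL fetched above follows a simple pattern that can be assembled from the Bioc version, package name and package version. A small sketch; the function name and the `base` default are assumptions made for the example:

```r
# Build the source tarball URL for a package in a given Bioconductor release.
# `base` is an assumed default matching the host used in the wget call above.
bioc_tarball_url_sketch <- function(pkg, version, bioc_version,
                                    base = "http://bioconductor.org/packages/") {
  paste(base, bioc_version, "/bioc/src/contrib/", pkg, "_", version, ".tar.gz", sep = "")
}

bioc_tarball_url_sketch("BiocGenerics", "0.6.0", "2.12")
#> [1] "http://bioconductor.org/packages/2.12/bioc/src/contrib/BiocGenerics_0.6.0.tar.gz"
```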
@chainsawriot ok I think it is done again, except maybe 2 points:
- The test coverage error: Not sure I can add any meaningful test to get rid of it. Can do some low-effort one to cover more lines.
- Should I simply change the R version required for caching or do another ifelse with bioc?
@schochastics You may leave the testing to me. For the cache testing
need_cache <- (isTRUE(any(grepl("^github::", .extract_pkgrefs(rang)))) &&
utils::compareVersion(rang$r_version, "3.1") == -1) ||
(isTRUE(any(grepl("^bioc::", .extract_pkgrefs(rang)))) &&
utils::compareVersion(rang$r_version, "3.3") == -1)
if (isTRUE(need_cache) && isFALSE(cache)) {
stop("Non-CRAN packages must be cached for this R version: ", rang$r_version, ". Please set `cache` = TRUE.", call. = FALSE)
}
In the review, I indicated that the header.R isn't correct. This is how I modify it to make docker run.
.install_packages <- function(tarball_path, lib, verbose, current_r_version) {
if (utils::compareVersion(current_r_version, "3.0") != -1) {
if (is.na(lib)) {
install.packages(pkg = tarball_path, repos = NULL, verbose = verbose, quiet = !verbose)
} else {
install.packages(pkg = tarball_path, lib = lib, repos = NULL, verbose = verbose, quiet = !verbose)
}
} else {
if (is.na(lib)) {
install.packages(pkg = tarball_path, repos = NULL)
} else {
install.packages(pkg = tarball_path, lib = lib, repos = NULL)
}
}
}
.download_package <- function(tarball_path, x, version, handle, source, uid, verbose, cran_mirror, bioc_mirror) {
if (source == "github") {
return(.download_package_from_github(tarball_path, x, version, handle, source, uid))
}
if (source == "bioc") {
url <- paste(bioc_mirror, "bioc/src/contrib/", x, "_", version, ".tar.gz", sep = "")
}
if (source == "cran") {
url <- paste(cran_mirror, "src/contrib/Archive/", x, "/", x, "_", version, ".tar.gz", sep = "")
}
tryCatch({
suppressWarnings(download.file(url, destfile = tarball_path, quiet = !verbose))
}, error = function(e) {
## is the current latest
url <- paste(cran_mirror, "src/contrib/", x, "_", version, ".tar.gz", sep = "")
download.file(url, destfile = tarball_path, quiet = !verbose)
})
invisible(tarball_path)
}
.tempfile <- function(tmpdir = tempdir(), fileext = ".tar.gz") {
## append fileext to the random name instead of treating it as a path component
file.path(tmpdir,
paste(paste(sample(c(LETTERS, letters), 20, replace = TRUE), collapse = ""), fileext, sep = ""))
}
.build_raw_tarball <- function(raw_tarball_path, x, version, tarball_path) {
tmp_dir <- .tempfile(fileext = "")
dir.create(tmp_dir)
system(command = paste("tar", "-zxf ", raw_tarball_path, "-C", tmp_dir))
pkg_dir <- list.files(path = tmp_dir, full.names = TRUE)[1]
new_pkg_dir <- file.path(tmp_dir, x)
file.rename(pkg_dir, new_pkg_dir)
res <- system(command = paste("R", "CMD", "build", new_pkg_dir))
expected_tarball_path <- paste(x, "_", version, ".tar.gz", sep = "")
stopifnot(file.exists(expected_tarball_path))
file.rename(expected_tarball_path, tarball_path)
return(tarball_path)
}
.install_from_source <- function(x, version, handle, source, uid, lib,
path = tempdir(), verbose, cran_mirror, bioc_mirror, current_r_version) {
tarball_path <- file.path(path, paste(x, "_", version, ".tar.gz", sep = ""))
raw_tarball_path <- file.path(path, paste("raw_", x, "_", version, ".tar.gz", sep = ""))
if (!file.exists(tarball_path) && !file.exists(raw_tarball_path)) {
.download_package(tarball_path = tarball_path, x = x, version = version, handle = handle, source = source,
uid = uid, verbose = verbose, cran_mirror = cran_mirror, bioc_mirror = bioc_mirror)
}
if (file.exists(raw_tarball_path)) {
tarball_path <- .build_raw_tarball(raw_tarball_path, x = x, version = version, tarball_path)
if (!file.exists(tarball_path)) {
stop("building failed.")
}
}
.install_packages(tarball_path, lib, verbose, current_r_version)
## check and error
if (!is.na(lib)) {
installed_packages <- installed.packages(lib.loc = lib)
} else {
installed_packages <- installed.packages()
}
if (!x %in% dimnames(installed_packages)[[1]]) {
stop("Fail to install ", x, "\n")
}
invisible()
}
# installing github packages
.download_github_safe <- function(handle, sha, file) {
tryCatch(
download.file(paste("http://api.github.com/repos/", handle, "/tarball/", sha, sep = ""), destfile = file),
error = function(e) {
stop(paste("couldn't download ", handle, " from github", sep = ""), call. = FALSE)
}
)
}
.tempfile <- function(tmpdir = tempdir(), fileext = ".tar.gz") {
file.path(tmpdir,
paste(paste(sample(c(LETTERS, letters), 20, replace = TRUE), collapse = ""), fileext, sep = ""))
}
.download_package_from_github <- function(tarball_path, x, version, handle, source, uid) {
sha <- uid
short_sha <- substr(sha, 1, 7)
dest_tar <- .tempfile(fileext = ".tar.gz")
tmp_dir <- tempdir()
tryCatch(
download.file(paste("https://api.github.com/repos/", handle, "/tarball/", sha, sep = ""), destfile = dest_tar),
error = function(e) {
.download_github_safe(handle, sha, dest_tar)
}
)
system(command = paste("tar", "-zxf ", dest_tar, "-C", tmp_dir))
dlist <- list.dirs(path = tmp_dir, recursive = FALSE) ## TODO list.dirs is 2.14
pkg_dir <- dlist[grepl(short_sha, dlist)] ## TODO grepl is 2.9.0
if(length(pkg_dir) != 1) {
stop(paste("couldn't uniquely locate the unzipped package source in ",tmp_dir, sep = ""))
}
res <- system(command = paste("R", "CMD", "build", pkg_dir), intern = TRUE)
expected_tarball_path <- paste(x, "_", version, ".tar.gz", sep = "")
if (!file.exists(expected_tarball_path)) {
stop("Cannot locate the built tarball.")
}
file.rename(from = expected_tarball_path, to = tarball_path)
return(tarball_path)
}
## In Unix, all things are file.
## Before you complain, R <= 3.2.0 doesn't have dir.exists.
if (file.exists("cache")) {
path <- "cache"
} else {
path <- tempdir()
}
if (nrow(installation_order) >= 1) {
for (i in seq(from = 1, to = nrow(installation_order), by = 1)) {
x <- installation_order$x[i]
source <- installation_order$source[i]
version <- installation_order$version[i]
handle <- installation_order$handle[i]
uid <- installation_order$uid[i]
.install_from_source(x = x, version = version, handle = handle, source = source, uid = uid,
lib = lib, path = path, verbose = verbose,
cran_mirror = cran_mirror, bioc_mirror = bioc_mirror,
current_r_version = current_r_version)
}
}
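To make the cache condition from the start of this comment easy to unit test, it could be pulled into a small pure function. This is only a sketch: `need_cache_sketch()` is a hypothetical name and it takes the pkgrefs as a character vector instead of calling `.extract_pkgrefs(rang)`.

```r
# TRUE when any non-CRAN source needs caching for the given R version:
# GitHub packages before R 3.1, Bioconductor packages before R 3.3
# (thresholds taken from the condition quoted above).
need_cache_sketch <- function(pkgrefs, r_version) {
  older_than <- function(v) utils::compareVersion(r_version, v) == -1
  has_github <- isTRUE(any(grepl("^github::", pkgrefs)))
  has_bioc <- isTRUE(any(grepl("^bioc::", pkgrefs)))
  (has_github && older_than("3.1")) || (has_bioc && older_than("3.3"))
}

need_cache_sketch("bioc::BiocGenerics", "3.0.1")
need_cache_sketch("bioc::BiocGenerics", "4.2.2")
need_cache_sketch("github::schochastics/netUtils", "3.2.0")
```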
BTW, I suggest writing
ok hope all suggestions are in. Sorry for the header/footer miss. Edit: well guess I broke something
Ok the as_pkgrefs implementation is still missing for bioc
third time's the charm: I hope everything is done now
aaah forgot this: #58 (comment)
Nice, found this
fourth time's...
@schochastics Yes, I will do the testing. |
@chainsawriot Draft so you can monitor the progress