WISH: R CMD check to assert that DLLs are unregistered when package is unloaded #29

Open
HenrikBengtsson opened this issue Jul 11, 2016 · 3 comments
Labels
annoyance, enhancement, on r-devel or r-pkg-devel mailing lists (issue has been raised on the R-devel or R-pkg-devel mailing lists), package-building

Comments

@HenrikBengtsson
Owner

HenrikBengtsson commented Jul 11, 2016

Background

Packages with native code load DLLs when they are loaded. More precisely, on Windows, Dynamic Link Library (DLL) files are loaded, and on Unix-like systems, shared library (SO) files are loaded.

For example, when a fresh R session is started we have the following DLLs:

$ R --vanilla
> dll0 <- getLoadedDLLs()
> dll0
                                                Filename Dynamic.Lookup
base                                                base          FALSE
methods       /usr/lib/R/library/methods/libs/methods.so          FALSE
utils             /usr/lib/R/library/utils/libs/utils.so          FALSE
grDevices /usr/lib/R/library/grDevices/libs/grDevices.so          FALSE
graphics    /usr/lib/R/library/graphics/libs/graphics.so          FALSE
stats             /usr/lib/R/library/stats/libs/stats.so          FALSE

Loading a package with native code adds another entry, e.g.

> library("matrixStats")
> getLoadedDLLs()
                                                                              Filename Dynamic.Lookup
base                                                                              base          FALSE
methods                                     /usr/lib/R/library/methods/libs/methods.so          FALSE
utils                                           /usr/lib/R/library/utils/libs/utils.so          FALSE
grDevices                               /usr/lib/R/library/grDevices/libs/grDevices.so          FALSE
graphics                                  /usr/lib/R/library/graphics/libs/graphics.so          FALSE
stats                                           /usr/lib/R/library/stats/libs/stats.so          FALSE
matrixStats /home/hb/R/x86_64-pc-linux-gnu-library/3.3/matrixStats/libs/matrixStats.so           TRUE

Unloading a package that registers a DLL should (ideally) not only unload the package but also unregister its DLL, e.g.

> unloadNamespace("matrixStats")
> getLoadedDLLs()
                                                Filename Dynamic.Lookup
base                                                base          FALSE
methods       /usr/lib/R/library/methods/libs/methods.so          FALSE
utils             /usr/lib/R/library/utils/libs/utils.so          FALSE
tools             /usr/lib/R/library/tools/libs/tools.so          FALSE
internet                 /usr/lib/R/modules//internet.so           TRUE
grDevices /usr/lib/R/library/grDevices/libs/grDevices.so          FALSE
graphics    /usr/lib/R/library/graphics/libs/graphics.so          FALSE
stats             /usr/lib/R/library/stats/libs/stats.so          FALSE

A package can unload its registered DLLs using:

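.onUnload <- function(libpath) {
    ## Run pending finalizers first; some may still need the DLL
    gc()
    ## Unload and unregister this package's DLL
    library.dynam.unload(utils::packageName(), libpath)
}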

Forcing the garbage collector to run (gc()) before unloading triggers any pending finalizer functions, some of which may need the DLL to run.
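To illustrate why gc() comes first, here is a minimal, self-contained sketch showing that an explicit gc() gives pending finalizers a chance to run. The function name finalizer_runs_on_gc() is made up for this illustration, and the finalizer is plain R code; in a real package it would typically call into the DLL, which is why the DLL must still be loaded at that point.

```r
## Hypothetical illustration (not part of any package)
finalizer_runs_on_gc <- function() {
  ran <- FALSE
  ## The finalizer only flips a flag in the enclosing function
  mark_done <- function(obj) ran <<- TRUE

  handle <- new.env()              # stand-in for an object backed by native code
  reg.finalizer(handle, mark_done)
  rm(handle)                       # drop the last reference to the object

  invisible(gc())                  # explicit collection; pending finalizers may run here
  ran
}

finalizer_runs_on_gc()
```

Whether the flag is set on return depends on the collector having reclaimed the object during that gc() call, which is the usual but not strictly guaranteed outcome.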

Issue

It turns out that several packages forget to unregister their DLLs when unloaded. For example,

> library("digest")
> unloadNamespace("digest")
> getLoadedDLLs()
                                                                  Filename Dynamic.Lookup
base                                                                  base          FALSE
methods                         /usr/lib/R/library/methods/libs/methods.so          FALSE
utils                               /usr/lib/R/library/utils/libs/utils.so          FALSE
tools                               /usr/lib/R/library/tools/libs/tools.so          FALSE
internet                                   /usr/lib/R/modules//internet.so           TRUE
grDevices                   /usr/lib/R/library/grDevices/libs/grDevices.so          FALSE
graphics                      /usr/lib/R/library/graphics/libs/graphics.so          FALSE
stats                               /usr/lib/R/library/stats/libs/stats.so          FALSE
digest    /home/hb/R/x86_64-pc-linux-gnu-library/3.3/digest/libs/digest.so           TRUE

(UPDATE: The digest package has since fixed this, but the example still applies to many other packages).

The problem with packages not unregistering their DLLs when unloaded is that they risk eventually filling up R's internal DLL registry, which can hold only MAX_NUM_DLLS (== 100) entries. When this happens, R fails to load any package that needs to register a DLL, with the following error message:

maximal number of DLLs reached...

This is guaranteed to happen if one tries to load and unload all CRAN packages one by one, e.g.

for (pkg in CRANpkgs) {
  loadNamespace(pkg)
  unloadNamespace(pkg)
}
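To see how close a session is to that limit, one can count the entries returned by getLoadedDLLs(). The sketch below is only an approximation: dll_headroom() is a hypothetical helper, its default limit of 100 is simply the MAX_NUM_DLLS value quoted above, and it assumes every listed entry counts against the registry.

```r
## Hypothetical helper; `limit` defaults to the MAX_NUM_DLLS value quoted above
dll_headroom <- function(limit = 100L) {
  n_loaded <- length(getLoadedDLLs())
  c(loaded = n_loaded, limit = limit, free = limit - n_loaded)
}

dll_headroom()
```

Calling it before and after a load/unload cycle shows whether the "free" count returns to its previous value.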

There have been several reports of hitting this limit.

Suggestion / Wish

R CMD check assertion

Have R CMD check also assert that the package unloads any registered DLLs when it is unloaded, e.g.

* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... WARNING
  Unloading the namespace does not unload DLL
* checking loading without being on the library search path ... OK

unloadNamespace()

Assert / warn

Maybe unloadNamespace() should check for left-over DLLs and give a warning whenever coupled DLLs are not unloaded.
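What such a check could look like from user code is sketched below. This is not how unloadNamespace() or R CMD check works; unload_namespace_checked() is a hypothetical wrapper, and it assumes that a package's DLLs live somewhere under its installation directory, so that left-overs can be identified by file path.

```r
## Hypothetical wrapper around unloadNamespace() (illustration only)
unload_namespace_checked <- function(pkg) {
  stopifnot(is.character(pkg), length(pkg) == 1L, isNamespaceLoaded(pkg))

  ## Installation directory of the package, recorded before unloading
  pkg_dir <- normalizePath(getNamespaceInfo(pkg, "path"), winslash = "/")

  unloadNamespace(pkg)

  ## File path of every DLL that is still registered.
  ## Use [[ ]] here: `$` on a DLLInfo object looks up native symbols instead.
  dlls <- getLoadedDLLs()
  dll_paths <- vapply(dlls, function(dll) dll[["path"]], character(1))
  dll_paths <- normalizePath(dll_paths, winslash = "/", mustWork = FALSE)

  ## Assumption: a DLL belongs to the package if it sits under pkg_dir
  leftover <- names(dlls)[startsWith(dll_paths, paste0(pkg_dir, "/"))]
  if (length(leftover) > 0L) {
    warning(sprintf("Unloading namespace '%s' did not unload DLL(s): %s",
                    pkg, paste(leftover, collapse = ", ")))
  }
  invisible(leftover)
}

## Example with a base package that ships native code (arbitrary choice)
loadNamespace("splines")
unload_namespace_checked("splines")
```

Matching on the file path rather than on the DLL name avoids flagging DLLs of dependencies that were pulled in when the package was loaded and that legitimately stay loaded.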

Concerns

Karl Miller wrote on 2016-12-20 (https://stat.ethz.ch/pipermail/r-devel/2016-December/073528.html):
"It's not always clear when it's safe to remove the DLL."

UPDATE 2016-12-20: Add recommendation to run gc() before removing DLL when unloading a package. See thread https://stat.ethz.ch/pipermail/r-devel/2016-December/073522.html.

@HenrikBengtsson
Owner Author

HenrikBengtsson commented Jul 12, 2016

Related: The next release of R.utils, 2.4.0 (on CRAN), will provide:

  • strayDLLs(): identifies stray DLLs
  • gcDLLs(): identifies and unloads stray DLLs

@HenrikBengtsson
Owner Author

HenrikBengtsson commented Jan 15, 2017

Related

B Ripley just submitted the following "note to self comments" to src/main/Rdynload.c:

 +/* Note that it is likely that dlopen will use up at least one file
 +   descriptor for each DLL loaded (it may load further dynamically
 +   linked libraries), so we do not want to get close to the fd limit
 +   (which may be as low as 256). */
 #define MAX_NUM_DLLS	100

So, apparently, there might be downstream issues if MAX_NUM_DLLS is just increased (as several are requesting), although it doesn't look like too serious a problem.

Existing limits set by the OS

Here dlopen refers to a native function: "function dlopen() loads the dynamic shared object (shared library) file named by the null-terminated string filename and returns an opaque "handle" for the loaded object". The fd limit refers to "the maximum number of open files / file descriptors (FD)". The limit is specific to each system. On Ubuntu 16.04 one can find the limit as:

$ cat /proc/sys/fs/file-max
1613668

and the "hard" and "soft" values for a user (I don't know exactly what these mean):

$ ulimit -Hn
65536

$ ulimit -Sn
1024

(couldn't R query these limits?)

The above are the defaults on my OS setup. Apparently, one can increase this limit, e.g. https://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/.
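On the question of whether R could query these limits: from R code, one option is to ask the shell. The sketch below assumes a Unix-like system where system() runs its command through a POSIX shell that has a ulimit builtin; query_fd_limit() is a hypothetical helper, not an existing R function, and it will not work on Windows.

```r
## Hypothetical helper (Unix-like systems only)
query_fd_limit <- function(kind = c("soft", "hard")) {
  kind <- match.arg(kind)
  flag <- if (kind == "soft") "-S" else "-H"
  ## Same information as `ulimit -Sn` / `ulimit -Hn` in the shell
  out <- system(paste("ulimit", flag, "-n"), intern = TRUE)
  if (identical(out, "unlimited")) Inf else as.numeric(out)
}

if (.Platform$OS.type == "unix") {
  print(c(soft = query_fd_limit("soft"), hard = query_fd_limit("hard")))
}
```

The values reflect the limits of the running R process, which may differ from those of an interactive login shell.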

@HenrikBengtsson
Owner Author

HenrikBengtsson commented Jan 26, 2017

Related

As of 2017-01-26, in R devel (>= 3.4.0), the DLL limit is now effectively increased to min(0.6*fd_limit, 1000), i.e. at most 60% of the file-descriptor limit and never more than 1000, cf. wch/r-source@3b49af7 :)

The NEWS entry for R-devel (to become 3.5.0) is:

  • The maximum number of DLLs that can be loaded into R e.g. via dyn.load() has been increased up to 614 when the OS limit on the number of open files allows.
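The 614 in that entry can be reproduced with simple arithmetic, if the rule is read as "at most 60% of the file-descriptor limit, capped at 1000" and the soft limit is 1024 as in the ulimit output above. The helper below is a hypothetical illustration of that reading, not R's actual implementation.

```r
## Hypothetical helper: only the arithmetic implied by the rule as read above
max_dlls_for_fd_limit <- function(fd_limit, fraction = 0.6, cap = 1000) {
  min(floor(fraction * fd_limit), cap)
}

max_dlls_for_fd_limit(1024)   # 614, the figure in the NEWS entry
max_dlls_for_fd_limit(65536)  # 1000, the cap applies
```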

HenrikBengtsson changed the title from "WISH / ROBUSTNESS: R CMD check to assert that DLLs are unregistered when package is unloaded" to "WISH: R CMD check to assert that DLLs are unregistered when package is unloaded [SOLVED in R (>= 3.5.0)]" on Feb 13, 2018
HenrikBengtsson changed the title from "WISH: R CMD check to assert that DLLs are unregistered when package is unloaded [SOLVED in R (>= 3.5.0)]" to "WISH: R CMD check to assert that DLLs are unregistered when package is unloaded" on Feb 23, 2018
HenrikBengtsson added the "on r-devel or r-pkg-devel mailing lists" label on Aug 29, 2018