-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unexpected: read.xlsx alters random number generation sequence #183
Comments
That is expected with calls from and to c++ (in our case |
@JanMarvin thanks for the comment. |
During the lunch break I couldn't reproduce this behavior with a minimal example (might be, that the Rcpp guys changed things on their end; there was chatter in the Rcpp repo a while ago and now I could not reproduce this SO). Chances are, that their might be things on our end (the c++ code is included in a rather unusual way) interfering with the seed. But still, I'd say that's up to the user to figure and happens most likely with many different packages embedding c++ code. As a rule of thumb don't run interfering code. To answer your question: I think that the seed is altered by the c++ call. Therefore your random numbers are different. To my understanding, and I wouldn't call myself an expert on the topic, this is due to the call to the c++ functions and how they handle the |
This results in a change of seed.
I hope this explains the behaviour. Maybe this is a topic for R-Core. |
Good catch! Question is: do we want to preserve the seed? I'd still be in favor of leaving this to the user. set.seed(123)
x <- .Random.seed
ref <- rnorm(123)
identical(ref, rnorm(123)) # FALSE
.Random.seed <- x
identical(ref, rnorm(123)) # TRUE |
Leaving it to the user seems a good idea, which could be handled by the options of the package. Every function which creates a temporary directory or file would need a correction for the random seed. To check that we would need to create also loads of new tests to check for the seed too. |
I may be wrong, but the documentation for tempdir() states that is simply returns the per-session temporary directory which is created before the interpreter is started.
So the problem is not tempdir() per se, but the fact that a temporary directory or file is created by read.xlsx() which uses a random name and is not presumably using tempfile(), which is careful not to disturb the random seed, according to this experiment:
|
I guess the line in question might be this one? Lines 142 to 143 in 89ce583
|
Just a comment because we had a problem recently which was also caused by this code segment. We had multiple child processes reading xls files (via So anyhow thanks for the fix :) Using |
# openxlsx 4.2.7.1 * It's now possible to insert a hyperlinked image by passing a URL, relative or absolute file path, or mailto string to the new `address` parameter of `insertImage()`. # openxlsx 4.2.7 * Fixed warning on `dataValidation(..., type = "list")` ([#342](ycphs/openxlsx#342)) * Added optional argument to `loadWorkbook` to decide if empty/blank cells should be converted to NA_character_ (the default) or left blank as is * `saveWorkbook()` now succeeds when called after the user has set column widths for a range of columns (e.g. 1:2), saved the workbook, then set column widths for a new range that is inclusive of the previous one (e.g. 1:5) ([#493](ycphs/openxlsx#493)). ## Improvements * Improve detectDates ([#288](ycphs/openxlsx#288)) * Preserve window size and position, also `getWindowSize()` and `setWindowSize()` ([466](ycphs/openxlsx#466)) # openxlsx 4.2.6 * Fix external links ([#410](ycphs/openxlsx#410)) * Do not add unneccessary sheetPr node ([#409](ycphs/openxlsx#409)) * Add support for `namedRegion`s having dots and other special characters ([#338](ycphs/openxlsx#338)). * Add type blanks and not blanks to conditional formatting ([#311](ycphs/openxlsx#311)) # openxlsx 4.2.5 ## Fixes * `openxlsx_setOp()` now works with named list ([#215](ycphs/openxlsx#215)) * `loadWorkbook()` imports `inlineStr`. Values remain `inlineStr` when writing the workbook with `saveWorkbook()`. Similar `read.xlsx` and `readWorkbook` import `inlineStr`. * `read.xlsx()` no longer changes random seed ([#183](ycphs/openxlsx#183)) * fixed a regression that caused fonts to be read in incorrectly ([#207](ycphs/openxlsx#207)) * add option to save as read only recommended ([#201](ycphs/openxlsx#201)) * fixed writing hyperlink formulas ([#200](ycphs/openxlsx#200)) * `write.xlsx()` now throws an error if it doesn't have write permissions ([#190](ycphs/openxlsx#190)) * `write.xlsx()` now again uses the default of `overwrite = TRUE` for saving files ([#249](ycphs/openxlsx#249)) * `as.character.formula()` exported to warn about potential conflicts with other packages ([#312](ycphs/openxlsx#312), [#315](ycphs/openxlsx#315)) ## Improvements * `options()` are more consistently set in functions (see: [#289](ycphs/openxlsx#262)) * `Workbook$show()` no longer fails when called in a 0 sheet workbook([#240](ycphs/openxlsx#240)) * `read.xlsx()` again accepts `.xlsm` files ([#205](ycphs/openxlsx#205), [#209](ycphs/openxlsx#209)) * `makeHyperlinkString()` does no longer require a sheet argument ([#57](ycphs/openxlsx#57), [#58](ycphs/openxlsx#58)) * improvements in how `openxlsx` creates temporary directories (see [#262](ycphs/openxlsx#262)) * `writeData()` calls `force(x)` to evaluate the object before options are set ([#264](ycphs/openxlsx#264)) * `createComment()` now correctly handles `integers` in `width` and `height` ([#275](ycphs/openxlsx#275)) * `setStyles()` accepts `halign="justify"` ([#305](ycphs/openxlsx#305))
Expected Behavior
Loading an excel file via read.xlsx() does not affect the sequence of generated random number
Actual Behavior
Using read.xlsx() results in a change in the random number sequence.
Steps to Reproduce the Problem
(please attach an example xlsx file if possible)
The problem is independent of the specific excel file loaded, so no file is attached.
Interstingly, the issue does not appear with R 3.4.1 and openxlsx 4.0.17
The following transcript, run in a fresh R session, documents the behaviour:
sessionInfo()
The text was updated successfully, but these errors were encountered: