Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in detectDates = TRUE #249

Closed
JKigotho opened this issue Jan 28, 2017 · 7 comments
Closed

Error in detectDates = TRUE #249

JKigotho opened this issue Jan 28, 2017 · 7 comments

Comments

@JKigotho
Copy link

Hi Alex,

The detectDates argument is not able to read dates in dd/mm
dat.xlsx
/yyyy h:mm AM/PM format (e.g 23/08/2012 15:09:05). Hence when the argument is set to detectDates = TRUE when using read.xlsx in an excel file which has dates in the above format, I get the below error:

Error in read.xlsx.default(xlsxFile = "Customer Open Transaction Listing With Cut Off.xlsx", :
basic_string::substr: __pos (which is 8) > this->size() (which is 5)

This is new to 4.0.0. I've also tried on the development version as suggested in issue #242 but got the same error.

See the last column in the attached file. Without the column, the file loads successfully.

R version 3.3.2
Openxlsx version 4.0.0

Regards,

Joseph.

@awalker89
Copy link
Owner

I can't reproduce the error you are seeing. As a work around you can try

x <- read.xlsx("dat.xlsx")
x$CREATED.DATE.AND.TIME <- convertToDateTime(x$CREATED.DATE.AND.TIME)
x

I'll continue looking into this.

@Munikumarnm
Copy link

Hi Alex,
Even i got the same error when i tried to use Detect Dates.
Error in read.xlsx.default("ConcurIndirectSpend.xlsx", detectDates = T) :
basic_string::substr: __pos (which is 8) > this->size() (which is 7)

Regards
Munikumar

@neros
Copy link

neros commented Jul 24, 2020

Confirming error exists as of July 2020 in version 4.1.5
When detect Dates is TRUE Throwing error (not thrown when FALSE):
Error in read_workbook(cols_in = cell_cols, rows_in = cell_rows, v = v, :
basic_string::substr: __pos (which is 8) > this->size() (which is 1)

@aushev
Copy link

aushev commented Mar 21, 2021

I confirm having the same error. Here's how to reproduce it:

  1. Create blank spreadsheet in Excel
  2. In cell A1, enter some text
  3. In cells A2 and A3, enter some date (1/1/2021 for example), check that it is recognized as date
  4. In Format -> Number -> Custom, choose mm/dd/yyyy for A2 and mm"-"dd"-"yyyy for A3
  5. Save as xlsx file
  6. Open as openxlsx::read.xlsx('test.xlsx', detectDates=TRUE)

Error in read_workbook(cols_in = cell_cols, rows_in = cell_rows, v = v, :
basic_string::substr: __pos (which is 8) > this->size() (which is 5)

@neros
Copy link

neros commented Mar 29, 2021

Loading the workbook first and then passing the workbook to read.xlsx somewhat resolves this error. (A warning message is received instead, but the read happens and valid dates appear to be read in). For what it's worth I discovered this in the course of tracking down another issue with a script taking nearly 3 minutes to read-in a 200line Excel sheet -- evidently the sheet author had highlighted a number of full-length columns with a background color and (I presume, a guess) read.xlsx was reading in the million+ max rows in the sheet, even though they were blank. In any case, loading the workbook first also resolved this issue.

And the warning message received was:
Warning message:
In as.Date(as.integer(v[isDate]) - origin, origin = "1970-01-01") :
NAs introduced by coercion

Will read the file and throw a warning:
wb <- openxlsx::loadWorkbook("c:/test.xlsx")
x<-openxlsx::read.xlsx(xlsxFile=wb,sheet="name,colNames=FALSE, skipEmptyRows=FALSE,skipEmptyCols=FALSE,detectDates=TRUE)

Will throw an error and fail to read when encountering certain date formats:
x<-openxlsx::read.xlsx(xlsxFile=""c:/test.xlsx",sheet="name",colNames=FALSE, skipEmptyRows=FALSE,skipEmptyCols=FALSE,detectDates=TRUE)

@juniperlsimonis
Copy link

i encountered a similar problem with openxlsx_4.2.3 running in R version 4.0.5 (2021-03-31) in Windows 10
i can't share the data as is, and the dataset is massive, and i don't know the specific cause of the error, but

when i use detect dates, i get the same

Error in read_workbook(cols_in = cell_cols, rows_in = cell_rows, v = v,  : 
  basic_string::substr: __pos (which is 8) > this->size() (which is 4)

turning off detect dates, it reads in cleanly

just noting this for you, i'm ok with not using detect dates.

thanks for all of your work on this!

@aushev
Copy link

aushev commented Nov 6, 2021

I can confirm the issue is still there. See the file attached, opening it with detectDates=TRUE raises an error:

Error reading date:
44489.4
row: 1
col: 1
Error in read_workbook(cols_in = cell_cols, rows_in = cell_rows, v = v,  : 
  basic_string::substr: __pos (which is 8) > this->size() (which is 7)

bad.xlsx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants