Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

having datetime-strings in input parameters will produce errors #75

Open
OssiLehtinen opened this issue Jan 31, 2020 · 8 comments
Open
Labels
bug Something isn't working

Comments

@OssiLehtinen
Copy link

Issue Description

A query such as:

tbl(con, "table) %>% 
filter(date_col == "2020-01-31 51:59:25")

will produce an error, as Athena does not understand a literal of the form date '2020-01-31 51:59:25', but R can cast it to a date even with 'as.Date(x, tryFormats = "%Y-%m-%d")' used for testing in sql_escape_string.AthenaConnection.

This can be fixed with;

# Athena specifc S3 method for converting date variables and iso formateed date strings to date literals
#' @rdname sql_translate_env
#' @export
sql_escape_string.AthenaConnection <- function(con, x) {
  all_dates <- all(try(as.Date(x, tryFormats = "%Y-%m-%d"), silent=T) == x) & all(nchar(x) == 10)
  if(all_dates & !is.na(all_dates)) {
    paste0('DATE ', DBI::dbQuoteString(con, x))
  } else {
    DBI::dbQuoteString(con, x)
  }
}

However, it wouldn't be a bad idea to have similar handling of date times and format those to, e.g., "datetime'2020-01-31 51:59:25'".

I will look into it, but it would be good to at least have the fix above included before pushing to cran.

@DyfanJones DyfanJones added the bug Something isn't working label Jan 31, 2020
@DyfanJones
Copy link
Owner

Are you sure datetime'2020-01-31 51:59:25' works from what I have read it needs to be in TIMESTAMP '2001-08-22 03:04:05.321' data types

@OssiLehtinen
Copy link
Author

Ah, sorry, you are right: it should be 'timestamp' and not 'datetime'. The fractional seconds are optional, however, but can have at maximum three digits.

Timestamps of the format '2020-01-31T11:59:25Z' do not work, however.

@DyfanJones
Copy link
Owner

This is true, however I think this bug fix should be sufficient enough for what it is try to achieve.

@OssiLehtinen
Copy link
Author

Yes!

If one wants similar handling of timestamps, there is an additional hurdle:

dbplyr formats timestamp variables in the above mentioned format (e.g., '2020-01-31T11:59:25Z', https://github.com/tidyverse/dbplyr/blob/38cdafd74303baaee02e545831d242de6c656f4c/R/escape.R#L76) so one would need to either be able to write an athena specific version of escape.POSIXt (that is, dbplyr should export the generic or a sql_escape_POSIXt etc) or alternatively one could just drop the T's and Z's and what not before passing the string on, but then one will surely get in trouble with timezones.

So the dates seem to be the low hanging fruit and timestamps spell trouble (as usual)...

days-since-last-timezone-issue-1-days-since-last-timezone-66030538

@DyfanJones
Copy link
Owner

PR #72 now includes this bug fix. Will leave issue open

@OssiLehtinen
Copy link
Author

FYI:
the development version of dbplyr has now generics for sql_escape_date and sql_escape_datetime, which allow, for example, Athena-specific handling of date-variables as inputs. Also simple AthenaConnection-methods exist there now.

tidyverse/dbplyr#391

If I am not missing something, this should not affect the functionality of RAthena of noctua. For the case of real date variables (as opposed to date strings such as "2020-01-01"), dbplyr shoud add the 'date' keyword out of the box and the "is this string actually a date" -deduction will not be needed nor done for those cases.

For the date strings, the implemented date test is still required. That is something Hadley didn't want to implement in dbplyr as the default behaviour. However, in my opinion, that feature is super useful.

Do you see some issues with these changes?

@DyfanJones
Copy link
Owner

DyfanJones commented Apr 21, 2020

Ah perfect. it looks like you have done alot of work with dbplyr to make it simpler. If I have understood this correctly the sql_escape_string can be removed or atleast only implemented for early versions of dbplyr? If this is the case I will wrap a if statement around if for the dbplyr version :)

@DyfanJones
Copy link
Owner

@OssiLehtinen just had a deeper look at the new changes in dbplyr and it looks like no change is require for RAthena or noctua as you said earlier (sorry just waking up and that :D ). All I will do is align the syntax and use lower case instead of upper :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants