Can we have endpoint_override as an attribute if not yet? #169
Comments
Hi @aoyh, sounds like a good enhancement :) I will start working on an implementation tomorrow :) |
Thanks a lot for your prompt reply! With that enhancement, I don't have to switch to a library in another language. Another 2 good reasons:
|
Hi @aoyh, I have done some initial work on this feature. Please have a go and let me know if it meets the requirements 😄
remotes::install_github("dyfanjones/rathena", ref="sdk-extra-parameters")
Note: So you should be able to do the following:
library(RAthena)
library(DBI)
con <- dbConnect(RAthena::athena(),
profile_name = "rathena",
work_group = 'mygroup',
endpoint_url = 'url.aws.com') |
@DyfanJones So quick! Thanks. Will check and let you know. |
Hmmm I think my initial implementation won't work as expected 🤔 I forgot each AWS service uses its own endpoint. Currently it looks like the JDBC driver just overrides the Athena endpoint:
This is a similar issue when using
from pyathena import connect
from pyathena.pandas.cursor import PandasCursor
cursor = connect(
profile_name = "default",
s3_staging_dir="s3://made-up",
endpoint_url = "https://athena.eu-west-1.amazonaws.com",
cursor_class=PandasCursor).cursor()
df = cursor.execute("select * from sampledb.elb_logs limit 10").as_pandas()
To resolve this, we could let the user choose which AWS service endpoints to override, for example endpoints = list(Athena = "my.amazing.endpoint"). Possibly also allowing a plain string that only affects Athena's endpoint 🤔 |
@aoyh Second attempt :P In this implementation you can override each AWS service endpoint. To do so, provide a named list of the service endpoints you want to override. If you provide only an endpoint (as a character string), then the AWS Athena endpoint will be overridden. I think this gives a lot of flexibility to users who want to use custom endpoints :) Please have a go and let me know if this meets the requirements. P.S. does the new documentation make sense? If not, please let me know 😄
library(DBI)
con1 = dbConnect(
RAthena::athena(),
endpoint_override = "https://athena.eu-west-1.amazonaws.com"
)
dbGetInfo(con1)
#> $endpoint_override
#> $endpoint_override$athena
#> [1] "https://athena.eu-west-1.amazonaws.com"
#>
#>
#> $region_name
#> [1] "eu-west-1"
#>
#> $keyboard_interrupt
#> [1] TRUE
#>
#> $timezone
#> [1] "UTC"
#>
#> $expiration
#> NULL
#>
#> $kms_key
#> NULL
#>
#> $encryption_option
#> NULL
#>
#> $poll_interval
#> NULL
#>
#> $work_group
#> [1] "primary"
#>
#> $dbms.name
#> [1] "default"
#>
#> $s3_staging
#> [1] "s3://dummy"
#>
#> $profile_name
#> NULL
#>
#> $boto3
#> [1] "1.21.35"
#>
#> $RAthena
#> [1] "2.5.1.9000"
con2 = dbConnect(
RAthena::athena(),
endpoint_override = list(athena = "https://athena.eu-west-1.amazonaws.com")
)
dbGetInfo(con2)
#> $endpoint_override
#> $endpoint_override$athena
#> [1] "https://athena.eu-west-1.amazonaws.com"
#>
#>
#> $region_name
#> [1] "eu-west-1"
#>
#> $keyboard_interrupt
#> [1] TRUE
#>
#> $timezone
#> [1] "UTC"
#>
#> $expiration
#> NULL
#>
#> $kms_key
#> NULL
#>
#> $encryption_option
#> NULL
#>
#> $poll_interval
#> NULL
#>
#> $work_group
#> [1] "primary"
#>
#> $dbms.name
#> [1] "default"
#>
#> $s3_staging
#> [1] "s3://dummy"
#>
#> $profile_name
#> NULL
#>
#> $boto3
#> [1] "1.21.35"
#>
#> $RAthena
#> [1] "2.5.1.9000"
con3 = dbConnect(
RAthena::athena(),
endpoint_override = list(
Athena = "https://athena.eu-west-1.amazonaws.com",
s3 = "https://s3.eu-west-1.amazonaws.com"
)
)
dbGetInfo(con3)
#> $endpoint_override
#> $endpoint_override$athena
#> [1] "https://athena.eu-west-1.amazonaws.com"
#>
#> $endpoint_override$s3
#> [1] "https://s3.eu-west-1.amazonaws.com"
#>
#>
#> $region_name
#> [1] "eu-west-1"
#>
#> $keyboard_interrupt
#> [1] TRUE
#>
#> $timezone
#> [1] "UTC"
#>
#> $expiration
#> NULL
#>
#> $kms_key
#> NULL
#>
#> $encryption_option
#> NULL
#>
#> $poll_interval
#> NULL
#>
#> $work_group
#> [1] "primary"
#>
#> $dbms.name
#> [1] "default"
#>
#> $s3_staging
#> [1] "s3://dummy"
#>
#> $profile_name
#> NULL
#>
#> $boto3
#> [1] "1.21.35"
#>
#> $RAthena
#> [1] "2.5.1.9000"
Created on 2022-04-20 by the reprex package (v2.0.1) |
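The behaviour visible in the three dbGetInfo() outputs above (a bare string is stored under athena, and the Athena name used for con3 comes back as athena) can be sketched as a small normalization step. This is a minimal sketch in Python for illustration only, not RAthena's actual code; the function names normalize_endpoint_override and endpoint_for are hypothetical.

```python
from typing import Dict, Mapping, Optional, Union


def normalize_endpoint_override(
    endpoint_override: Optional[Union[str, Mapping[str, str]]],
) -> Dict[str, str]:
    """Hypothetical helper: turn the user input into {service: endpoint_url}."""
    if endpoint_override is None:
        # Nothing overridden: every service keeps its default endpoint.
        return {}
    if isinstance(endpoint_override, str):
        # A bare string is treated as the Athena endpoint only.
        return {"athena": endpoint_override}
    # A named mapping may override several services; names are lowercased
    # so that "Athena" and "athena" refer to the same service.
    return {service.lower(): url for service, url in endpoint_override.items()}


def endpoint_for(overrides: Mapping[str, str], service: str) -> Optional[str]:
    """Return the override for one service, or None to use the default."""
    return overrides.get(service.lower())
```

With this sketch, a service that is not named in the mapping (for example Glue in the con3 case) gets None, meaning its default endpoint is left untouched.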
@DyfanJones Thanks for the detailed explanation! Since my endpoint looks like
Any clue? |
It looks like your IAM role doesn't have permission to StartQueryExecution on AWS Athena using that endpoint. Can you double check your IAM roles to ensure you are able to :) My guess is you don't have permission to run AWS Athena in the
The S3 and Glue services should be ok as you haven't overridden their endpoints :) |
Additional info:
I use R a lot more, thus don't mind spending more time figuring it out how to make RAthena work here. Many thanks! |
Sorry, I might have missed something. Are you getting the error in |
Yes, I get the error in |
Have you tried using |
@aoyh did it work for pyathena? or did you get a similar error to RAthena? |
@aoyh any update on this? |
Hi @DyfanJones Sorry for my late reply. I spent a few days trying to get
However, another python package pyathenajdbc seems to be working there.
Note that the Thanks! |
Thanks for doing that investigation @aoyh. From the looks of it As |
No worries @DyfanJones . I will use a workaround by starting with Thank you all the same! |
@aoyh there is an R package that uses the jdbc driver |
Thanks @DyfanJones for the timely tip. I was just wondering whether a similar R package might exist, and you just shed light on it! |
I will do an initial release of the endpoint_override feature. In the meantime I will have to open up the Athena JDBC driver to see what the difference is. From my understanding I am passing the endpoint correctly, however I could be mistaken. |
@aoyh I have done some tweaking to the implementation of the
# Enable repository from dyfanjones
options(repos = c(
  ropensci = 'https://dyfanjones.r-universe.dev',
  CRAN = "https://cloud.r-project.org"))
# Download and install RAthena in R
install.packages('RAthena')
Many thanks for the testing you have done for me so far |
General notes for completeness .... It looks like
Here is the method
@apply_configs
def client(
    service_name: str,
    session: Optional[boto3.Session] = None,
    botocore_config: Optional[botocore.config.Config] = None,
    verify: Optional[Union[str, bool]] = None,
) -> boto3.client:
    """Create a valid boto3.client."""
    endpoint_url: Optional[str] = _get_endpoint_url(service_name=service_name)
    return ensure_session(session=session).client(
        service_name=service_name,
        endpoint_url=endpoint_url,
        use_ssl=True,
        verify=verify,
        config=default_botocore_config() if botocore_config is None else botocore_config,
    )
Note: The endpoint needs to be in the same region as the connection, i.e. region: eu-west-1 and endpoint: https://athena.eu-west-1.amazonaws.com/. If the endpoint's region doesn't match the connection's region, you are at risk of the following AWS error: |
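The region note above can be checked before connecting. The following is a minimal sketch, assuming the endpoint follows the standard public pattern service.region.amazonaws.com; custom or VPC endpoints use other hostname shapes and are not recognised by it. The names region_from_endpoint and endpoint_matches_region are hypothetical and not part of RAthena, pyathena or awswrangler.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: standard public endpoints look like <service>.<region>.amazonaws.com
_STANDARD_HOST = re.compile(
    r"^(?P<service>[a-z0-9-]+)\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


def region_from_endpoint(endpoint_url: str) -> Optional[str]:
    """Return the region embedded in a standard endpoint URL, else None."""
    host = urlparse(endpoint_url).hostname or ""
    match = _STANDARD_HOST.match(host)
    return match.group("region") if match else None


def endpoint_matches_region(endpoint_url: str, region_name: str) -> bool:
    """True if the endpoint's region equals region_name.

    Endpoints whose region cannot be determined are not rejected,
    because this sketch cannot say anything about them.
    """
    endpoint_region = region_from_endpoint(endpoint_url)
    return endpoint_region is None or endpoint_region == region_name
```

Accepting endpoints whose region cannot be parsed is a deliberate choice in this sketch: it only flags mismatches it can actually see, such as an eu-west-1 Athena endpoint combined with a different connection region.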
RAthena v2.6.0 has been released on CRAN. Let me know if you're still having the endpoint_override issue.
Create an attribute named endpoint_override in dbConnect
Example: