-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: use shillelagh instead of gsheetsdb #13185
Conversation
Codecov Report
@@ Coverage Diff @@
## master #13185 +/- ##
===========================================
+ Coverage 53.06% 66.85% +13.79%
===========================================
Files 489 492 +3
Lines 17314 29051 +11737
Branches 4482 0 -4482
===========================================
+ Hits 9187 19421 +10234
- Misses 8127 9630 +1503
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
096ba14
to
45550d5
Compare
While fixing unit tests I did some clean up and removed duplicated code and duplicated exceptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor nit - this is a great step forward for as the gsheets
connector seems to be fairly widely used. Also big thanks for cleaning up the related code 👍
setup.py
Outdated
@@ -129,7 +129,7 @@ def get_git_sha(): | |||
"elasticsearch": ["elasticsearch-dbapi>=0.2.0, <0.3.0"], | |||
"exasol": ["sqlalchemy-exasol>=2.1.0, <2.2"], | |||
"excel": ["xlrd>=1.2.0, <1.3"], | |||
"gsheets": ["gsheetsdb>=0.1.9"], | |||
"gsheets": ["shillelagh>=0.2"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add , <0.3
here, just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -1234,7 +1231,7 @@ def testconn( # pylint: disable=too-many-return-statements,no-self-use | |||
uri = request.json.get("uri") | |||
try: | |||
if app.config["PREVENT_UNSAFE_DB_CONNECTIONS"]: | |||
check_sqlalchemy_uri(uri) | |||
check_sqlalchemy_uri(make_url(uri)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised we weren't getting mypy errors here before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me too.
SUMMARY
shillelagh is a Python module that provides a SQLAlchemy dialect and DB API 2.0 implementation to a variety of non-SQL resources. For example, it can be used to query CSV files or fetch weather data from https://www.weatherapi.com/ via SQL. The library has 100% unit test coverage and uses SQLite virtual tables to allow efficiently accessing the resources as if they were tables in a database.
The current module that we use to access Google Spreadsheets is gsheetsdb, which depends on the unmaintained moz-sql-parser. In addition to the dependency,
gsheetsdb
implements a code transpiler, and uses pre and post-processors to fetch data from the spreadsheet, making the code somewhat brittle.This week I implemented a Google Spreadsheets adapter in shillelagh, aiming to deprecate the
gsheetsdb
module. The implementation is a drop-in replacement, and users should be able to uninstall thegsheetsdb
module and install theshillelagh
module to have existing queries, datasets and charts continue working.This works because
shillelagh
registers a SQLAlchemy dialect called "gsheets". This way SQLAlchemy will useshillelagh
for all engines with the URL "gsheets://". This is an in-memory only backend, without access to the filesystem, so it's safe to use.On the other hand,
shillelagh
also registers its own dialect (aptly named "shillelagh", with an alias called "shillelagh+apsw"). That dialect is unsafe, as a malicious user could use it to read data from the filesystem by enabling the CSV adapter. For security reasons, I disabled the "shillelagh" dialect unlessPREVENT_UNSAFE_DB_CONNECTIONS
is on.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TEST PLAN
Tested SQL Editor and charts reading from Google spreadsheets, everything works.
ADDITIONAL INFORMATION