Google spreadsheets #5915

Merged
merged 7 commits into from
Dec 10, 2018

betodealmeida
Member

@betodealmeida betodealmeida commented Sep 17, 2018

This PR allows querying Google spreadsheets via SQLAlchemy, using a new module called gsheetsdb.

[image: gsheets]

It works, but requires pushing some fixes upstream to moz-sql-parser.

SQL Lab also works:

[screenshot]

@betodealmeida
Member Author

Loading the tables in SQL Lab:

[image: sqllab_gsheet]

this.setState({ tableName });
this.props.actions.queryEditorSetSchema(this.props.queryEditor, schemaName);
this.fetchTables(this.props.queryEditor.dbId, schemaName);
}
Member Author

The current code allows switching schema by typing new_schema.some_table in the table selector. The code breaks when the table name has a period in it, since it does naive splitting. I opted for removing it here, since it seems a bit unintuitive as well.
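
To make the failure mode concrete, here is a minimal Python sketch of period-based splitting. It is an illustration only: the selector code under review is JavaScript, and `naive_split` and the sample names (including the spreadsheet id) are hypothetical.

```python
def naive_split(selection, current_schema):
    """Split a table-selector value on periods, the way a naive parser would.

    Hypothetical helper for illustration; not the actual SQL Lab code.
    """
    pieces = selection.split(".")
    if len(pieces) == 1:
        # No period: keep the current schema.
        return current_schema, pieces[0]
    # Otherwise treat the first piece as the schema and the second as the table.
    return pieces[0], pieces[1]


# Works for the intended "schema.table" shortcut.
ok = naive_split("new_schema.some_table", "main")

# Breaks for a table name that itself contains periods, such as a
# spreadsheet URL: the name is cut at the first two periods.
broken = naive_split("https://docs.google.com/spreadsheets/d/abc123/edit", "main")
```

With a spreadsheet URL as the table name, the "schema" becomes `https://docs` and the "table" becomes `google`, which is why dropping the shortcut is the simpler fix.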

@@ -159,11 +159,12 @@ class ExploreResultsButton extends React.PureComponent {
</div>);
}
render() {
const allowsSubquery = this.props.database && this.props.database.allows_subquery;
Member

nit: destructure props on the line before; then you can just use allows_subquery in the disabled param

const {
  database: { allows_subquery },
} = this.props;

https://github.com/lyft/marketdx/blob/master/src/components/Diagnosis/index.jsx#L55

@@ -2223,10 +2223,11 @@ def sqllab_viz(self):
}))

@has_access
@expose('/table/<database_id>/<table_name>/<schema>/')
@expose('/table/<database_id>/<path:table_name>/<schema>/')
@log_this
def table(self, database_id, table_name, schema):
Member

What is the path doing for us here? Is it a reference to the param in the URI?
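
For context, in Flask (Werkzeug) routing `path` is a URL converter: it behaves like the default string converter but also accepts slashes. That presumably matters here because a spreadsheet's table name is a full URL. The sketch below emulates the two converters with plain regular expressions; the patterns are simplified stand-ins, not Werkzeug's implementation, and the spreadsheet id is made up.

```python
import re

# Simplified stand-ins for the two converters (assumption for illustration):
# the default converter matches one segment without slashes, while `path`
# may span several segments.
DEFAULT_SEGMENT = r"[^/]+"
PATH_SEGMENT = r"[^/].*?"


def build_route(table_converter):
    """Compile a pattern shaped like /table/<database_id>/<table_name>/<schema>/."""
    return re.compile(
        r"^/table/(?P<database_id>[^/]+)/(?P<table_name>%s)/(?P<schema>[^/]+)/$"
        % table_converter
    )


def match_table(pattern, url):
    """Return the captured table name, or None when the route does not match."""
    m = pattern.match(url)
    return m.group("table_name") if m else None


plain_route = build_route(DEFAULT_SEGMENT)
path_route = build_route(PATH_SEGMENT)

simple_url = "/table/1/my_table/main/"
sheet_url = "/table/1/https://docs.google.com/spreadsheets/d/abc123/edit/main/"

print(match_table(plain_route, simple_url))  # ordinary table name
print(match_table(plain_route, sheet_url))   # no match: the name contains slashes
print(match_table(path_route, sheet_url))    # the full spreadsheet URL
```

Under these assumptions, only the `path` variant can capture a table name that contains slashes; ordinary table names match either way.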

}

@classmethod
def select_star(cls, my_db, table_name, engine, schema=None, limit=100,
Member

I would put the limit var as a class var at the top as DEFAULT_LIMIT = 100
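
A minimal sketch of what this suggestion could look like. `ExampleEngineSpec` and its reduced `select_star` signature are hypothetical and much simpler than the method in the diff above; only the `DEFAULT_LIMIT = 100` name and value come from the comment.

```python
class ExampleEngineSpec:
    """Hypothetical, simplified engine spec; not Superset's actual class."""

    # Class-level default, so the magic number lives in one visible place.
    DEFAULT_LIMIT = 100

    @classmethod
    def select_star(cls, table_name, limit=None):
        # Resolve the default at call time so subclass overrides take effect.
        if limit is None:
            limit = cls.DEFAULT_LIMIT
        return 'SELECT * FROM "{}" LIMIT {}'.format(table_name, limit)


class SmallLimitSpec(ExampleEngineSpec):
    # A subclass can change the default without touching select_star.
    DEFAULT_LIMIT = 10
```

Using `None` as the argument default, rather than `limit=DEFAULT_LIMIT`, is a deliberate choice in this sketch: Python evaluates default arguments once at definition time, so a subclass override would otherwise be ignored.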

@betodealmeida
Member Author

Time series now work:

screen shot 2018-10-30 at 12 08 57 pm

I also tested against a spreadsheet behind authentication, and it works.

@betodealmeida changed the title from "[WIP] Google spreadsheets" to "Google spreadsheets" Oct 30, 2018
@kristw added the enhancement:request label Dec 3, 2018
@codecov-io

codecov-io commented Dec 7, 2018

Codecov Report

Merging #5915 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #5915      +/-   ##
==========================================
+ Coverage   73.31%   73.33%   +0.01%     
==========================================
  Files          67       67              
  Lines        9594     9600       +6     
==========================================
+ Hits         7034     7040       +6     
  Misses       2560     2560
Impacted Files                 Coverage Δ
superset/db_engine_specs.py    55.42% <100%> (+0.25%) ⬆️
superset/views/core.py         75% <100%> (+0.03%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc8acf2...6be792d. Read the comment docs.

@mistercrunch
Member

LGTM

@betodealmeida betodealmeida merged commit f366bbe into apache:master Dec 10, 2018
betodealmeida added a commit to lyft/incubator-superset that referenced this pull request Dec 10, 2018
* Google spreadsheets

* Fetch table metadata in SQL Lab

* Show full URL for spreadsheet

* Fix version

* Remove sqllab changes

(cherry picked from commit f366bbe)
@neildunlop

Can anyone point to an example of setting up the datasource? I can set up the connection and test it fine, but once I save, it raises an error.

@hemantaggarwal

hemantaggarwal commented Dec 25, 2019

@neildunlop: Were you able to set up an organization Google Sheet as a datasource?
@betodealmeida: I am able to link public sheets, but how can I do it for organization-specific sheets in Superset?

@cachafla

@betodealmeida ping. I'd like some help setting this up as well. I tried setting up a datasource with a URL such as:

gsheets://docs.google.com/spreadsheets/d/<spreadsheet-id>/edit#gid=0

But I get the following error:

2020-01-14 22:51:53,769:ERROR:root:'from'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/superset/views/core.py", line 1705, in testconn
    conn.scalar(select([1]))
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 920, in scalar
    return self.execute(object_, *multiparams, **params).scalar()
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1468, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
  File "/home/superset/.local/lib/python3.6/site-packages/gsheetsdb/db.py", line 36, in g
    return f(self, *args, **kwargs)
  File "/home/superset/.local/lib/python3.6/site-packages/gsheetsdb/db.py", line 137, in execute
    query, headers, self.credentials)
  File "/home/superset/.local/lib/python3.6/site-packages/gsheetsdb/query.py", line 96, in execute
    from_ = extract_url(query)
  File "/home/superset/.local/lib/python3.6/site-packages/gsheetsdb/url.py", line 51, in extract_url
    return parse_sql(sql)['from']
KeyError: 'from'

I haven't found any documentation that shows how to set up GSheets as a data source, even though this PR shows it's possible.

@hemantaggarwal

In the data source, enter only "gsheets://" instead of the full link. Testing the connection may give an error, but just save the data source and query the Google Sheet in the SQL editor using the full sheet link. Note that the sheet must be public.
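
A minimal sketch of this workaround, assuming the sheet URL goes in the FROM clause as a double-quoted table name. `build_sheet_query` is a hypothetical helper that only builds the SQL string; it does not contact Google or Superset. The traceback above suggests why the connection test fails: it runs a `SELECT 1` with no FROM clause, while gsheetsdb looks up the `from` key of the parsed query.

```python
# SQLAlchemy URI to enter as the data source, per the workaround above.
GSHEETS_URI = "gsheets://"


def build_sheet_query(sheet_url, columns="*"):
    """Return a SELECT statement that uses the sheet URL as the table name.

    Hypothetical helper; it only builds a string and does not contact Google.
    """
    if '"' in sheet_url:
        raise ValueError("sheet URL must not contain double quotes")
    return 'SELECT {} FROM "{}"'.format(columns, sheet_url)


# Placeholder spreadsheet id, as in the URL quoted earlier in this thread.
query = build_sheet_query(
    "https://docs.google.com/spreadsheets/d/<spreadsheet-id>/edit#gid=0"
)
print(query)
```

The resulting statement is what would be pasted into the SQL editor, with the real spreadsheet id substituted for the placeholder.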

@ispulkit

Can you explain a bit about the workaround for private and org-based gSheets?

@mistercrunch added the 🏷️ bot and 🚢 0.34.0 labels Feb 27, 2024