Add support for StringDtype (available in Pandas >=1.0) #1237
Comments
We'll never support pure transparent pandas compatibility, so the best bet is to convert to stable APIs (like object) prior to trying to ingest. We will try to keep up with core developments, but our current pandas target is 0.22, so there's more likelihood of us pinning
What is "pure transparent pandas compatibility"? Support for
@texodus As an example, pivot deconstruction and reconstruction from pandas is mostly broken right now, and there are lots of scenarios where you can't just go from pandas into perspective and get the same results; e.g., in a pivot table you might not have the ability to unpivot and repivot. Also, we should be careful to avoid features explicitly marked "unstable", whether or not they exist in a newer version.
From the docs: "StringDtype is considered experimental. The implementation and parts of the API may change without warning."
Hey, I have been looking into this issue and have been able to reproduce it. The issue has been open for a while, and I was wondering if anyone has come across a possible approach. I have tried the potential solution above and other variations of it (dtype('str'), dtype(np.str_)), and it seems to have resolved the error, but it generated a new error below:
Add support for StringDtype (fixes #1237)
Feature Request
Allow for Pandas dataframe columns of "string" datatype - see https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html
Description of Problem:
Right now the following call of perspective's Table constructor fails:
with
PerspectiveError: Mixed datasets of numpy.ndarray and lists are not supported.
Potential Solutions:
Maybe just convert (internally) to dtype('O') for backward compatibility as a first step. At least
Mixed datasets of numpy.ndarray and lists are not supported.
needs to be replaced with something more meaningful, as StringDtype is neither a list nor a numpy array. It also looks like StringDtype can be converted to an Arrow array without problems (see https://github.com/pandas-dev/pandas/blob/v1.1.3/pandas/core/arrays/string_.py#L26-L97).
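
Until StringDtype is handled natively, the conversion to object suggested above can also be done on the caller's side before ingesting. Below is a minimal sketch, assuming pandas >= 1.0; the helper name stringdtype_to_object is hypothetical and is not part of perspective or pandas. It only shows the pandas side and does not call perspective's Table constructor.

```python
import pandas as pd


def stringdtype_to_object(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every StringDtype column cast to object.

    Hypothetical helper for illustration; other columns keep their dtypes.
    """
    # Collect the names of columns whose dtype is the "string" extension dtype.
    string_cols = [
        name
        for name, dtype in df.dtypes.items()
        if isinstance(dtype, pd.StringDtype)
    ]
    # astype with a per-column mapping leaves unlisted columns unchanged.
    return df.astype({name: object for name in string_cols})


# Example frame with one "string" column and one integer column.
df = pd.DataFrame(
    {
        "a": pd.array(["x", "y", None], dtype="string"),
        "b": [1, 2, 3],
    }
)
converted = stringdtype_to_object(df)
print(converted.dtypes)
```

The resulting frame can then be passed on in place of the original. Note that missing values in the converted column are not replaced by this sketch; whether the ingesting code accepts them as-is is not confirmed here.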