This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

supported dtypes #24

Open
jreback opened this issue Sep 15, 2016 · 11 comments

@jreback

jreback commented Sep 15, 2016

Obvious / currently supported see here
xref #20

  • integer
  • unsigned integer
  • float
  • complex
  • boolean
  • datetime (ns)
  • datetime w/tz (ns)
  • timedelta (ns)
  • period
  • category
  • var length string
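
For reference, a minimal sketch of how each entry in the list above can be constructed in present-day pandas. It assumes pandas and NumPy are installed; the `examples` dict is purely illustrative, and the reported datetime/timedelta resolution depends on the pandas version.

```python
import numpy as np
import pandas as pd

# One small Series per logical type named in the list above.
examples = {
    "integer": pd.Series([1, 2], dtype="int64"),
    "unsigned integer": pd.Series([1, 2], dtype="uint8"),
    "float": pd.Series([1.5, 2.5], dtype="float64"),
    "complex": pd.Series([1 + 2j], dtype="complex128"),
    "boolean": pd.Series([True, False], dtype="bool"),
    "datetime": pd.Series(pd.to_datetime(["2016-09-15"])),
    "datetime w/tz": pd.Series(pd.to_datetime(["2016-09-15"]).tz_localize("UTC")),
    "timedelta": pd.Series(pd.to_timedelta(["1 days"])),
    "period": pd.Series(pd.period_range("2016-09-15", periods=2, freq="D")),
    "category": pd.Series(["a", "b", "a"], dtype="category"),
    # Variable-length strings stored as Python objects.
    "var length string": pd.Series(["x", "yy"], dtype=object),
}

for name, series in examples.items():
    print(f"{name:20s} -> {series.dtype}")
```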

Informational, may want to think about the desirability of adding later

non-support (try to raise informative errors & point to ready-made solutions)

  • void (a variation on object)
  • non-supported combinations (e.g. arbitrary dtypes, though maybe a pre-defined union)
@max-sixty

  • different datetime precision & ranges (e.g. ms vs ns)
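
The precision/range trade-off can be made concrete with plain arithmetic. This is a rough sketch, assuming timestamps are stored as signed 64-bit tick counts around an epoch (as NumPy's `datetime64` does); the helper name `years_each_side` is made up for illustration.

```python
# Span of a signed 64-bit tick counter on each side of the epoch,
# for several tick sizes. Pure arithmetic; no pandas/NumPy required.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60  # mean Gregorian year

TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}


def years_each_side(unit: str) -> float:
    """Years representable before/after the epoch with int64 ticks."""
    max_ticks = 2**63 - 1
    return max_ticks / TICKS_PER_SECOND[unit] / SECONDS_PER_YEAR


for unit in TICKS_PER_SECOND:
    print(f"{unit:>2}: about +/-{years_each_side(unit):,.0f} years")
```

Nanosecond ticks give only about 292 years on each side of 1970, which is why ns timestamps run from roughly 1677 to 2262, while millisecond ticks cover hundreds of millions of years.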

@chris-b1

chris-b1 commented Sep 15, 2016

Under possible

  • date with no time type (edit: on second thought, is this anything more than a Period[D]?)
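
A quick sketch of how a daily `Period` already behaves like a date without a time component, assuming a reasonably recent pandas:

```python
import pandas as pd

# A daily period identifies a calendar day, not an instant.
day = pd.Period("2016-09-15", freq="D")
print(day)             # string form carries no time-of-day
print(day.start_time)  # first instant covered by the period
print(day + 1)         # arithmetic moves in whole days

# A column of such values gets a period dtype rather than object.
days = pd.Series(pd.period_range("2016-09-15", periods=3, freq="D"))
print(days.dtype)
```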

@chrisaycock

Some of the more extreme:

  • IP addresses (both v4 and v6)
  • fractions, storing numerator and denominator as integers
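
Both ideas can be sketched with the standard library alone. The layouts here are assumptions for illustration, not a proposed pandas design: IPv4 addresses are stored as IPv4-mapped IPv6 so that every value occupies 16 bytes, and fractions are kept as two parallel int64 arrays. `pack_ip` and `unpack_ip` are hypothetical helper names.

```python
import ipaddress
from array import array
from fractions import Fraction


def pack_ip(text: str) -> bytes:
    """Store v4 and v6 addresses in one fixed 16-byte layout."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # IPv4-mapped IPv6 keeps the v4 origin recoverable.
        addr = ipaddress.IPv6Address(f"::ffff:{addr}")
    return addr.packed


def unpack_ip(raw: bytes):
    """Inverse of pack_ip: return an IPv4Address or IPv6Address."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped
    return mapped if mapped is not None else addr


# Fractions as two parallel signed 64-bit integer arrays.
numerators = array("q")
denominators = array("q")
for frac in [Fraction(1, 3), Fraction(6, 8)]:
    numerators.append(frac.numerator)      # Fraction normalizes 6/8 to 3/4
    denominators.append(frac.denominator)

restored = [Fraction(n, d) for n, d in zip(numerators, denominators)]
print(unpack_ip(pack_ip("192.168.0.1")), restored)
```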

@datnamer

datnamer commented Sep 21, 2016

@jreback wouldn't we just want a way to have user-defined dtypes instead of hardcoding a limited list? Can DyND help with this?

@jreback
Author

jreback commented Sep 21, 2016

you certainly can have parameterized types. but completely generic types is a recipe for disaster.
what do you think is missing for primitive / logical typing?

@datnamer

datnamer commented Sep 21, 2016

What do you mean by parameterized? What types can be parameterized and by what? The link is broken.

Sorry I'm a bit lost.

I'm thinking of having a column of distribution objects or linear models or agents with their own attributes.

@jreback
Author

jreback commented Sep 21, 2016

that's much too high level - though there is potential for another library to build on the pandas type system

we are talking about columns of primitives

parameterized types are things like

datetime64[D]
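
A small sketch of what "parameterized" means here, using NumPy's existing `datetime64` dtype, where the unit in brackets is the parameter. It assumes NumPy is installed.

```python
import numpy as np

# Same kind of type (datetime), different unit parameter.
day = np.dtype("datetime64[D]")
nano = np.dtype("datetime64[ns]")

print(day.kind, nano.kind)     # both report the datetime kind 'M'
print(np.datetime_data(day))   # (unit, count) for the day-resolution dtype
print(np.datetime_data(nano))  # (unit, count) for the ns-resolution dtype
print(day == nano)             # differently parameterized dtypes are not equal
```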

@datnamer

gotcha.

@wesm
Owner

wesm commented Sep 22, 2016

@datnamer either way, pandas needs to have its own metadata implementation (see the logical/physical decoupling discussion in https://pydata.github.io/pandas-design/internal-architecture.html#logical-types-and-physical-storage-decoupling). We do not want to delegate metadata details to a third party library. Data structures and computation are another matter on a case by case basis (i.e. assuming a library conforms to our memory representation expectations, we can use its algorithms). The tight coupling between metadata (numpy dtypes), memory representation, and algorithms/computation is part of why we are in the current mess.
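
To illustrate the decoupling idea only, here is a toy sketch in which the logical type is plain metadata and the algorithm depends solely on the physical layout. None of these names (`LogicalType`, `Column`, `take`) come from pandas or the design document; they are hypothetical.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalType:
    """Metadata only: what the values mean, plus the storage they need."""
    name: str
    storage: str  # array typecode describing the physical layout


@dataclass
class Column:
    logical: LogicalType
    data: array


# Two different logical types sharing one physical representation (int64).
TIMESTAMP_NS = LogicalType("timestamp[ns]", "q")
INT64 = LogicalType("int64", "q")


def take(col: Column, indices) -> Column:
    """Storage-level algorithm: it never inspects the logical type."""
    picked = array(col.data.typecode, (col.data[i] for i in indices))
    return Column(col.logical, picked)


stamps = Column(TIMESTAMP_NS, array("q", [10, 20, 30]))
counts = Column(INT64, array("q", [1, 2, 3]))
print(take(stamps, [2, 0]))
print(take(counts, [1]))
```

The same `take` serves both columns because it is written against the memory representation, while the metadata travels alongside unchanged.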

@jreback jreback added the dtypes label Sep 30, 2016
@jreback
Author

jreback commented Oct 5, 2016

maybe think about this: pandas-dev/pandas#3443, which is about nested dtypes in a single object. In another vein, we should think about a union type (which is essentially a restricted-looking object dtype); SFrame has these.
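
One way to picture a union type as a restricted object dtype is a container that accepts only a fixed set of element types. This is a toy sketch under that assumption; `UnionColumn` is a hypothetical name and does not reflect how SFrame or pandas implement anything.

```python
class UnionColumn:
    """Column whose values must each be exactly one of a fixed set of types."""

    def __init__(self, allowed, values=()):
        self.allowed = tuple(allowed)
        self.values = []
        for value in values:
            self.append(value)

    def append(self, value):
        # Exact type match, so bool is not silently accepted as int.
        if type(value) not in self.allowed:
            names = ", ".join(t.__name__ for t in self.allowed)
            raise TypeError(f"{type(value).__name__} is not in union[{names}]")
        self.values.append(value)


col = UnionColumn((int, str), [1, "a", 2])
try:
    col.append(1.5)  # float is outside the declared union
except TypeError as exc:
    print(exc)
```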

@sinhrks

sinhrks commented Oct 6, 2016

+1 for sparse.

maybe including subtype in sparse and categorical is useful, like category[int64]
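
For context, a short sketch of where the subtype information lives in present-day pandas (assuming a recent version): a categorical dtype exposes the dtype of its categories, and `pd.SparseDtype` carries a `subtype`.

```python
import pandas as pd

# Categorical: the categories themselves have a dtype, separate from the codes.
cat = pd.Series([10, 20, 10], dtype="category")
category_value_dtype = cat.dtype.categories.dtype
code_dtype = cat.cat.codes.dtype
print(category_value_dtype, code_dtype)

# Sparse: the dtype of the stored values is exposed as `subtype`.
sparse_dtype = pd.SparseDtype("int64", fill_value=0)
print(sparse_dtype.subtype)
```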


7 participants