-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add functions for arithmetic, rounding, logarithmic, and string transformations #230
Changes from all commits
41fb45a
e87a152
5bcf8b1
31d8d72
8735b4e
0042b79
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -153,6 +153,36 @@ scalar_functions: | |
- value: fp64 | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "negate" | ||
description: "Negation of the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i8 | ||
return: i8 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i16 | ||
return: i16 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i32 | ||
return: i32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i64 | ||
return: i64 | ||
- args: | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "modulus" | ||
description: "Get the remainder when dividing one value by another." | ||
|
@@ -173,6 +203,68 @@ scalar_functions: | |
- value: i64 | ||
- value: i64 | ||
return: i64 | ||
- | ||
name: "power" | ||
description: "Take the power with the first value as the base and second as exponent" | ||
impls: | ||
- args: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Where did you come up with these signatures from? For example, why does i8^i8 return i8? It seems like we'll run out of space very quickly. I feel like many systems support something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe @sanjibansg was basing these on Arrow's implementation. Arrow's implementation currently does not promote:
Postgres and MySQL promote integers to decimal. SQL Server does not promote (and raises an overflow error) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jacques-n do you have a strong preference which direction we go here? Should we have multiple |
||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i8 | ||
- value: i8 | ||
return: i8 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i16 | ||
- value: i16 | ||
return: i16 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i32 | ||
- value: i32 | ||
return: i32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i64 | ||
- value: i64 | ||
return: i64 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "sqrt" | ||
description: "Square root of the value" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What happens if the value is negative? |
||
impls: | ||
- args: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a particular reason that smaller numbers use fp32 as an output? It seems like maybe they should all be fp64. Are using a particular system's definition here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree. Arrow also upcasts all to fp64. |
||
- value: i8 | ||
return: fp64 | ||
- args: | ||
- value: i16 | ||
return: fp64 | ||
- args: | ||
- value: i32 | ||
return: fp64 | ||
- args: | ||
- value: i64 | ||
return: fp64 | ||
- args: | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- value: fp64 | ||
return: fp64 | ||
aggregate_functions: | ||
- name: "sum" | ||
description: Sum a set of values. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
%YAML 1.2 | ||
--- | ||
scalar_functions: | ||
- | ||
name: "ln" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be good to document what these options apply to (and their meaning) somewhere. In arrow, arithmetic functions overflow by default. However, logrithmic functions go to |
||
description: "Natural logarithm of the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "log10" | ||
description: "Logarithm to base 10 of the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "log2" | ||
description: "Logarithm to base 2 of the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
return: fp64 | ||
- | ||
name: "logb" | ||
description: "Logarithm of the value with the given base" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
- value: fp64 | ||
return: fp64 | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
%YAML 1.2 | ||
--- | ||
scalar_functions: | ||
- | ||
name: "ceil" | ||
description: "Rounding to the ceiling of the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: i64 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
return: i64 | ||
- | ||
name: "floor" | ||
description: "Rounding to the floor of the value" | ||
impls: | ||
- args: | ||
required: false | ||
- value: fp32 | ||
return: i32 | ||
- args: | ||
required: false | ||
- value: fp64 | ||
return: i64 | ||
- | ||
name: "round" | ||
description: "Rounding off the value" | ||
impls: | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i8 | ||
return: i8 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same as above, what does this mean? |
||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i16 | ||
return: i16 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i32 | ||
return: i32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: i64 | ||
return: i64 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp32 | ||
return: fp32 | ||
- args: | ||
- options: [ SILENT, SATURATE, ERROR ] | ||
required: false | ||
- value: fp64 | ||
return: fp64 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,7 +44,8 @@ scalar_functions: | |
- value: i32 | ||
- value: i32 | ||
return: "string" | ||
- name: starts_with | ||
- | ||
name: starts_with | ||
description: Whether this string starts with another string. | ||
impls: | ||
- args: | ||
|
@@ -163,3 +164,29 @@ scalar_functions: | |
- value: "fixedchar<L1>" | ||
- value: "varchar<L2>" | ||
return: "BOOLEAN" | ||
- | ||
name: lower | ||
description: Transforms the string to lower case characters | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's add some additional definition here of how lower case is defined. I assume there is a utf definition? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Arrow defers to utf8proc which is somewhat sparse on details. However, you are correct, there is a standard UTF-8 way of lowercasing, though it sometimes does the wrong thing semantically. |
||
impls: | ||
- args: | ||
- value: "varchar<L1>" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shall we add fixedchar? |
||
return: "varchar<L1>" | ||
- args: | ||
- value: "string" | ||
return: "string" | ||
- args: | ||
- value: "fixedchar<L1>" | ||
return: "fixedchar<L1>" | ||
- | ||
name: upper | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same comments as on lower. |
||
description: Transforms the string to upper case characters | ||
impls: | ||
- args: | ||
- value: "varchar<L1>" | ||
return: "varchar<L1>" | ||
- args: | ||
- value: "string" | ||
return: "string" | ||
- args: | ||
- value: "fixedchar<L1>" | ||
return: "fixedchar<L1>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh... I suppose this exists because of the uneveness of twos-compliment but ugh...