Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Decimal to CsvWriter and improve debug display #406

Merged
merged 3 commits into from
Jun 13, 2021

Conversation

alippai
Copy link
Contributor

@alippai alippai commented Jun 4, 2021

Which issue does this PR close?

Closes #405.

Rationale for this change

A simple CSV serializer for Decimals and matching debug format improvement.

What changes are included in this PR?

Decimal -> string conversion.

Are there any user-facing changes?

The debug display of DecimalArray(precision:10, scale:2) is changed from:

12345

to

123.45

Two things to consider for any reviewer:

  1. I might be wrong with try_into() and as i32 and other cast. I'm new to Rust and I don't know the exact implications of my introduced changes.
  2. For integer-like decimals I still write 1.0 instead of 1. This was a subjective decision, I don't have strong feelings about it. Padding was/is missing as well.

@codecov-commenter
Copy link

codecov-commenter commented Jun 4, 2021

Codecov Report

Merging #406 (83f5f6b) into master (dc5507a) will increase coverage by 0.00%.
The diff coverage is 89.28%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #406   +/-   ##
=======================================
  Coverage   82.65%   82.65%           
=======================================
  Files         164      164           
  Lines       45458    45478   +20     
=======================================
+ Hits        37573    37592   +19     
- Misses       7885     7886    +1     
Impacted Files Coverage Δ
arrow/src/util/display.rs 32.46% <60.00%> (+2.73%) ⬆️
arrow/src/array/array_binary.rs 90.13% <92.85%> (-0.03%) ⬇️
arrow/src/csv/writer.rs 83.15% <100.00%> (+0.50%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc5507a...83f5f6b. Read the comment docs.

arrow/src/csv/writer.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code is looking good @alippai -- thank you. I had some suggestions / questions on the API -- @nevi-me / @Dandandan do you have any thoughts on the API?

arrow/src/util/display.rs Outdated Show resolved Hide resolved
arrow/benches/csv_writer.rs Show resolved Hide resolved
assert_eq!(
"DecimalArray<23, 6>\n[\n 8887000000,\n -8887000000,\n]",
"DecimalArray<23, 6>\n[\n 8887.000000,\n -8887.000000,\n null,\n]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a significant improvement to me 👍

@alamb alamb mentioned this pull request Jun 6, 2021
@alippai alippai force-pushed the feature/decimal-format branch 2 times, most recently from 05b542f to c858052 Compare June 12, 2021 19:57
@alippai alippai force-pushed the feature/decimal-format branch from c858052 to 83f5f6b Compare June 12, 2021 20:08
@alippai
Copy link
Contributor Author

alippai commented Jun 12, 2021

@alamb Created DecimalArray::value_as_string() instead of the display.rs implementation.

What I couldn't find is the usage of #[inline] vs #[inline(always)] also pub fn value() is not inlined for DecimalArray but it is for primitive arrays. Is there a rule of thumb for this?

@nevi-me nevi-me merged commit fb45112 into apache:master Jun 13, 2021
@alippai
Copy link
Contributor Author

alippai commented Jun 13, 2021

Thanks @nevi-me

@alamb
Copy link
Contributor

alamb commented Jun 13, 2021

Thanks @alippai -- this looks great 👍

alamb pushed a commit that referenced this pull request Jun 19, 2021
* Add Decimal to CsvWriter and improve debug display

* Measure CSV writer instead of file and data creation

* Re-use decimal formatting
alamb pushed a commit that referenced this pull request Jun 20, 2021
* Add Decimal to CsvWriter and improve debug display

* Measure CSV writer instead of file and data creation

* Re-use decimal formatting
@alippai alippai deleted the feature/decimal-format branch June 20, 2021 23:27
alamb added a commit that referenced this pull request Jun 21, 2021
* Add Decimal to CsvWriter and improve debug display

* Measure CSV writer instead of file and data creation

* Re-use decimal formatting

Co-authored-by: Ádám Lippai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Decimal to CsvWriter
4 participants