Change default of float_precision for read_csv and read_table to "high" #36228

Dr-Irv · 2020-09-08T19:35:57Z

- closes read_csv returns different float values for same number #17154
- tests added / passed
  - modified tests/io/parser/test_c_parser_only.py to make sure all 4 options are tested
  - added tests/io/parser/test_c_parser_only.py:test_high_is_default
- passes black pandas
- passes git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry
  - for version 1.2

See discussion at bottom of #36149 for the performance tests. Added float_precision="legacy" so people can pick up the old parser. The keyword's default value stays None, which the C parser now treats as "high"; it can't be set to "high" directly because of incompatibility with the python parser.
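A minimal sketch of how the four options could be compared, assuming pandas 1.2 or later and the C engine. The CSV text and the helper name `parse_with` are illustrative only and are not part of this PR.

```python
import io

import pandas as pd

# Illustrative data only: the literal has more digits than a float64 can hold.
LITERAL = "0.3066101993807095471566981359501369297504425048828125"
CSV_TEXT = f"x\n{LITERAL}\n"


def parse_with(float_precision):
    """Parse CSV_TEXT with the C engine and return the single float read."""
    df = pd.read_csv(
        io.StringIO(CSV_TEXT),
        engine="c",  # float_precision is only supported by the C parser
        float_precision=float_precision,
    )
    return df["x"].iloc[0]


# None is the keyword default; with this change it selects the "high" converter.
results = {
    option: parse_with(option)
    for option in (None, "high", "legacy", "round_trip")
}

for option, value in results.items():
    print(f"{option!r:>12}: {float(value)!r}")
```

Leaving float_precision unset and passing "high" should give the same value, while "legacy" selects the old converter and may differ in the last digits.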

pep8speaks · 2020-09-08T20:43:53Z

Hello @Dr-Irv! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-09-11 22:09:32 UTC

WillAyd · 2020-09-09T15:16:50Z

If there is no performance difference should we not just get rid of the legacy parsing altogether?

Dr-Irv · 2020-09-09T16:17:51Z

> If there is no performance difference should we not just get rid of the legacy parsing altogether?

My only concern here is that maybe someone has code that inadvertently depends on it, so we have to keep it in there for some form of compatibility. I do think we could deprecate the legacy parsing.

jreback · 2020-09-09T16:39:20Z

yep that sounds fine to leave the option

just update the doc string to indicate

pandas/io/parsers.py

WillAyd

lgtm

jreback

minor comments

pandas/io/parsers.py

pandas/tests/io/parser/test_c_parser_only.py

Dr-Irv · 2020-09-11T23:39:29Z

@jreback added the check for an invalid float_precision option and now all green
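A rough sketch of what that check looks like from the caller's side, assuming pandas 1.2 or later. The helper name `rejects_float_precision` and the value "junk" are placeholders for illustration.

```python
import io

import pandas as pd


def rejects_float_precision(option):
    """Return True if read_csv raises ValueError for this float_precision."""
    try:
        pd.read_csv(
            io.StringIO("x\n1.5\n"),
            engine="c",  # the option is validated by the C parser
            float_precision=option,
        )
    except ValueError:
        return True
    return False


# "junk" is not a recognized option, while "legacy" is accepted.
print(rejects_float_precision("junk"))
print(rejects_float_precision("legacy"))
```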

jreback

great @Dr-Irv, very minor comment that can be addressed in a follow-on (if needed)

jreback · 2020-09-13T22:54:01Z

pandas/io/parsers.py

@@ -2299,6 +2299,7 @@ def TextParser(*args, **kwds):
        values. The options are None for the ordinary converter,
        'high' for the high-precision converter, and 'round_trip' for the
        round-trip converter.
+        .. versionchanged:: 1.2


check that this renders ok, I think it needs a blank line after

Dr-Irv added 2 commits September 8, 2020 13:32

change read_csv and read_table to use high precision by default

acffef2

Modify test, whatsnew

68ecda3

Dr-Irv added this to the 1.2 milestone Sep 8, 2020

add legacy option for float_precision for C parser

9aa25da

Dr-Irv added 2 commits September 8, 2020 16:45

remove blank line in tst file

afaf031

two spaces before inline comment

fa97aab

Dr-Irv requested a review from jreback September 9, 2020 01:30

WillAyd added the IO CSV read_csv, to_csv label Sep 9, 2020

WillAyd approved these changes Sep 9, 2020

jreback requested changes Sep 11, 2020

jreback requested a review from gfyoung September 11, 2020 12:59

Dr-Irv added 3 commits September 11, 2020 18:08

add test for invalid float_precision option

7f4cf45

Merge remote-tracking branch 'upstream/master' into issue17154

887198d

correct versionadded for 1.2

be5910d

jreback approved these changes Sep 13, 2020

jreback merged commit a3c4dc8 into pandas-dev:master Sep 13, 2020

Dr-Irv mentioned this pull request Sep 14, 2020

Fix documentation for new float_precision on read_csv #36358 (Merged)

Dr-Irv deleted the issue17154 branch September 18, 2020 11:35

kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020

Change default of float_precision for read_csv and read_table to "hig…

d4cd7ef

…h" (pandas-dev#36228)

simonjayhawkins mentioned this pull request Nov 23, 2020

test_c_parser_only on linux py_3.8_32 failing on MacPython.pandas-wheels #36429 (Closed)

jorisvandenbossche mentioned this pull request Dec 29, 2020

REGR: pd.read_csv segfaults with 1.2 (has worked since before pandas 1.0) #38753 (Closed)

simonjayhawkins mentioned this pull request Feb 8, 2021

BUG: Reading csv files with numbers with multiple leading zeros losses a lot of precision #39514 (Open)
