`PlotlyJSONEncoder` always casts values to float64 due to using `tolist()` #3232

nicolaskruchten · 2021-06-08T11:58:14Z

Regarding NumPy floating-point precision, and the fact that PlotlyJSONEncoder always casts such values to float64 because it uses tolist()...

This has always bugged me, as it results in much larger exports (i.e. HTML / ipynb file sizes) than necessary when float16 or float32 is sufficient, and it affects not only coordinate data but also marker sizes, meta info, etc.

Just in case the plotly.py devs or others are interested: I found a way to avoid this number inflation by modifying (and monkey patching) the encode_as_list method:
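To make the inflation concrete, here is a minimal sketch using only NumPy and the standard json module (no plotly involved). It assumes a small float32 array and compares what tolist() hands to the JSON encoder with what you get by going through each element's own string form:

```python
import json

import numpy as np

# A small float32 array, standing in for coordinate or marker-size data.
x32 = np.array([0.1, 0.2, 0.3], dtype=np.float32)

# tolist() widens every element to a Python float (float64), so the JSON
# text carries all the digits needed to reproduce the float32 value exactly.
as_float64 = x32.tolist()

# str() on a float32 scalar gives the shortest decimal that round-trips at
# float32 precision; parsing it back yields a "short" Python float.
shortest = [float(str(v)) for v in x32]

print(json.dumps(as_float64))
print(json.dumps(shortest))
```

The second line printed is noticeably shorter than the first, which is the effect the patch below relies on.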

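@staticmethod
def encode_as_list_patch(obj):
    """Attempt to use `tolist` method to convert to normal Python list."""
    if hasattr(obj, "tolist"):
        numpy = get_module("numpy")
        try:
            # Only 1-D float16/float32 arrays take the short-repr path.
            # '%s' formats each element with the shortest decimal that
            # round-trips at the array's own precision.
            if (
                isinstance(obj, numpy.ndarray)
                and obj.dtype in (numpy.float32, numpy.float16)
                and obj.ndim == 1
                and obj.flags.contiguous
            ):
                return [float("%s" % x) for x in obj]
        except AttributeError:
            raise NotEncodable

        # Everything else keeps the stock behaviour.
        return obj.tolist()
    else:
        raise NotEncodable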

It's about 30-50x slower than .tolist(), but since it takes on the order of a few μs, it is still much faster than the JSON encoding itself, with the benefit of ~3x smaller exports.
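If you want to check the size effect on your own data, a rough sketch like the following may help. It is not part of plotly: shortest_floats is a hypothetical helper that mirrors the list comprehension in the patch, and the random array is just placeholder data. The ratio you see will depend on your values and dtype.

```python
import json

import numpy as np


def shortest_floats(arr):
    """Hypothetical helper: encode a 1-D array via each element's own str()."""
    return [float(str(v)) for v in arr]


# Placeholder data; substitute the array you actually plot.
rng = np.random.default_rng(0)
data64 = rng.random(1000)

sizes = {}
for dtype in (np.float16, np.float32):
    arr = data64.astype(dtype)
    sizes[np.dtype(dtype).name] = (
        len(json.dumps(arr.tolist())),            # stock behaviour
        len(json.dumps(shortest_floats(arr))),    # patched behaviour
    )

for name, (via_tolist, via_str) in sizes.items():
    print(f"{name}: tolist={via_tolist} chars, shortest repr={via_str} chars")
```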

I always wanted to report this, and this PR revived the topic. Could this be relevant for a new issue (especially since orjson will not become the default)?

FYI: for reference, a quick search revealed that a patch of encode_as_list was already suggested before: #1842 (comment), in the context of treating inf & NaN, which got brought up again in #2880 (comment).

Originally posted by @mherrmann3 in #2955 (comment)

RRiva · 2022-09-28T08:09:40Z

Hi @nicolaskruchten, I just wanted to thank you for this nice solution 🙂 My animations went from the original 1500 KB to 900 KB in single precision and 700 KB in half precision, without a visible loss in quality. It would be very nice to have this code merged. The best way to do it is of course open for discussion. On the one hand, it is natural to respect the array type, but on the other hand I wonder how many users would take advantage of it. An alternative is to add a precision keyword to write_html() and do the casting/rounding internally. What do you think about it?
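To illustrate the rounding alternative: write_html() has no precision keyword today, so the sketch below only shows what such rounding could look like if done by hand before the data goes into a figure. round_for_export and its decimals parameter are hypothetical names, not plotly API.

```python
import json

import numpy as np


def round_for_export(values, decimals=3):
    """Hypothetical helper: round to a fixed number of decimals as float64.

    Rounding in float64 means tolist() later emits short decimal strings,
    regardless of the dtype the data started in.
    """
    arr = np.asarray(values, dtype=np.float64)
    return np.round(arr, decimals)


raw = np.array([0.123456789, 2.5, 100.0])
rounded = round_for_export(raw, decimals=3)

print(json.dumps(raw.tolist()))
print(json.dumps(rounded.tolist()))
```

Unlike the dtype-based patch, this works with the unmodified encoder, but it fixes the number of decimals rather than the number of significant digits, so it suits data with a known scale.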

For future reference, here is how to apply the monkey patch.

import importlib

import numpy as np

mod_plty = importlib.import_module('_plotly_utils.utils')

# Code from above.
@staticmethod
def encode_as_list_patch(obj):
    """Attempt to use `tolist` method to convert to normal Python list."""
    if hasattr(obj, "tolist"):
        try:
            # Only 1-D float16/float32 arrays take the short-repr path.
            if (
                isinstance(obj, np.ndarray)
                and obj.dtype in (np.float32, np.float16)
                and obj.ndim == 1
                and obj.flags.contiguous
            ):
                return [float('%s' % x) for x in obj]
        except AttributeError:
            raise mod_plty.NotEncodable

        return obj.tolist()
    else:
        raise mod_plty.NotEncodable


# Replace the encoder's method with the patched version.
mod_plty.PlotlyJSONEncoder.encode_as_list = encode_as_list_patch


# `arr` is your own data array (not defined here).
# Convert the numpy array to single precision.
arr_single = arr.astype(np.float32)

# Or half precision.
arr_half = arr.astype(np.float16)

# Afterwards, call write_html() as always.

gvwilson · 2024-07-05T12:32:35Z

Hi - we are trying to tidy up the stale issues and PRs in Plotly's public repositories so that we can focus on things that are still important to our community. Since this one has been sitting for a while, I'm going to close it; if it is still a concern, please add a comment letting us know what recent version of our software you've checked it with so that I can reopen it and add it to our backlog. Alternatively, if it's a request for tech support, please post in our community forum. Thank you - @gvwilson

RRiva · 2024-07-08T06:58:16Z

Hi @gvwilson, if I understand correctly, the float precision is handled correctly when the orjson engine is selected. Unfortunately, only plotly.io.write_json() accepts the engine argument, while plotly.io.write_html() doesn't. How can I specify this engine when writing an HTML file?

Thanks 🙂

gvwilson · 2024-07-08T11:40:01Z

Hi @RRiva - I don't have an answer for you right now, but I'll reopen this and add it to our backlog and try to find one for you. Cheers - @gvwilson

RRiva · 2024-07-08T11:42:04Z

Thanks so much 😄

nicolaskruchten mentioned this issue Jun 8, 2021

JSON encoding refactor and orjson encoding #2955

Merged

1 task

gvwilson closed this as completed Jul 5, 2024

gvwilson reopened this Jul 8, 2024

gvwilson assigned marthacryan Jul 8, 2024

gvwilson added the P3 backlog label Aug 12, 2024

gvwilson changed the title ~~Regarding the numpy floating point precision and that PlotlyJSONEncoder always casts those to float64 due to using tolist()...~~ PlotlyJSONEncoder always casts values to float64 due to using tolist() Aug 12, 2024

gvwilson unassigned marthacryan Aug 12, 2024

gvwilson added the feature something new label Aug 12, 2024
