-
-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(formatValue): fix the ordering of +/- symbols when formatting currency values #1376
Conversation
@ikesau We have a "SVG diff / svgTester" that runs on PRs, and for this one it found 64 differences. You can click on "Details" in the test run above, and then visit the link of the branch containing the SVGs, and then and then open the latest commit there. One category of differences is that the And some of the differences are exactly the intended fix – putting the For this type of change I'd maybe review 30 of the charts to see if any unintended changes are introduced. You can select the "rich diff" tab in the GitHub to make it easier to find differences (although it's a bit slow to do this in the GitHub views unfortunately...) Will do a code review now, but just wanted to give some background how it works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems more tricky to fix than I expected because of the recursive calls. I left some comments, but didn't think of ideas for a potentially good fix...
Just trying to document the output differences:
|
I would say few people know what these mean, and we shouldn't go any higher than trillion? Especially the short number prefixes, which I had never seen before. There is still the issue that in production currently we have some large values like (The ideal fix is to switch to scientific notation (#1171) for large magnitudes, but definitely out of scope here.)
I would say these two are bugs that need fixing? 😬 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is much neater! I hope this task didn't take a huge hit on motivation. 😬 (especially since I have a few more comments... 👀)
There are two things I'm wondering about:
-
Should we discard short number abbreviations if the unit is
%
? (chart)I think we don't do it for long abbreviations, e.g. we don't output
1 million %
but1,000,000%
, so we also probably shouldn't output1k%
. -
We are getting an additional decimal place in some cases in the COVID data explorer sidebar:
Is it likely related to the unit being
%
and short abbreviations enabled? Since it's potentially an explorer issue, let me know if you need help debugging!
clientUtils/formatValue.ts
Outdated
short: { | ||
k: "k", | ||
M: "M", | ||
G: "B", | ||
T: "T", | ||
P: "P", | ||
E: "E", | ||
Z: "Z", | ||
Y: "Y", | ||
}, | ||
long: { | ||
k: "k", | ||
M: " million", | ||
G: " billion", | ||
T: " trillion", | ||
P: " quadrillion", | ||
E: " quintillion", | ||
Z: " sextillion", | ||
Y: " septillion", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove all after trillion
(T
)? I don't think they're widely understood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, might this be more complicated since D3 is outputting them?
Could we tell D3 to disable SI prefixes if a value is quadrillion or higher, or are there cases where it could output 0.9P
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit tricky unfortunately. I agree, no one really knows these prefixes (hence remapping G to B) and there was work done in a d3-format PR to introduce this functionality natively, but I don't think there's any out-of-the-box way to set a threshold on abbreviation.
Some options we have:
- Switch to exponential notation after trillion (also handling cases where 9.5 trillion would get rounded to 1 quadrillion if precision is set to 1)
- Re-implement the logic to get it to output 1,234T (possibly quite hard to do with this flow)
- Change the prefixes from
P
tokT
orquad
orthousand trillion
or whatever.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I don't think we should be switching to exponential notation in this PR. As we discussed privately, when using exponential notation all values should be in exponential notation, not just the large ones. Also, the default formatting, e.g.
5e3
is problematic, not many people understand that it means5 × 10^3
(we could rewrite that, but it's another potential can of worms). - Yes the ideal fix is probably that it outputs
1,234 trillion
as it currently does. thousand trillion
is worse than what we have now (which is no double prefixes until we reachmillion trillion
, which in itself is a bug we wanted to fix when eventually there's an important enough use case)
This issue makes me wonder if we want to rely on a 3rd party library to this degree. It's possible there will be another feature in the future that is harder to implement because we have to circumvent d3-format
, and maybe it would force us to re-implement a subset of d3-format
ourselves? What do you think?
We could also maybe fork d3-format
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts:
Agreed on deferring the larger task of identifying when to use scientific notation across an entire graph. Though it won't be hard to update the exponent notation in postprocessString
- output.replace("e", " x 10^")
would do the trick.
I don't think 1,234 million trillion
is really any better than 1.234 sextillion
, though yes, people can at least understand 1,000,000T
unlike 1E
.
These are such rare cases though, and afaik, we don't use short abbreviation for them.
So, given we deal with numbers of this magnitude so rarely (supercomputer-power-flops
, ai-training-computation
, biomass-vs-abundance-taxa
) I think that leaving the long abbreviations as is, and changing 1E
to 1Quint
is the best way forward, and we can fix it for good when we handle those numbers properly which is with scientific notation.
Also, without diving into their code, I don't think we should fork d3-format and fundamentally change how it works. My intuition is that that would require far more effort than any other option. Working with the postprocessString
function will surely be easier than forking and rewriting a d3 project, even for future formatting features.
32524e7
to
852a630
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Ike!
This is all great work, and it's interesting how this small issue escalated into a complete rewrite.
However, I am unsure if relying on d3 so much is (still) the right way to go here. We need to do a lot of preprocessing and postprocessing to get out the result we want, and it seems that d3 isn't doing all that much work.
Also, the format specifiers they use are cryptic anyway, and understanding what we need to pass to d3 to get out what is a hard ask.
For these reasons, I feel like we could self-roll our formatValue
function instead - how hard can it be? 😆
We got good test coverage now (thank you!), and we can also make use of the well-supported Intl.NumberFormat, which gives us nice things like number grouping, { signDisplay: "exceptZero" }
and support for significant digits (if we want to go that way).
Let me know what you think, and feel free to discuss if anything's unclear!
["negative with unit", -1, "-$1", { unit: "$" }], | ||
["trailingZeroes true", 1.10, "1.1", { trailingZeroes: false }], | ||
["trailingZeroes false", 1.10, "1.10", { trailingZeroes: true }], | ||
["$ spaceBeforeUnit false", 1.1, "$1.1", { spaceBeforeUnit: false, unit: "$" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test case with { unit: "£" }
, because I'm currently unsure if that is handled correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think no chart uses £
, I tried searching for one a while ago. We could and maybe should just drop it, it's such an odd one to have (why not have €
too?)
Oof, oh no! I wasn't aware of the I've definitely come to understand the ins and outs of d3-format, but yeah obviously it's pretty oblique interface. Let me play around with NumberFormat a little bit and see how it handles before making a decision. |
Okay, so I made a Intl.NumberFormat demo It's basically the same thing. An object that you configure with a bunch of settings and then format with. Some differences:
So, we could rewrite it to ditch d3-format (though it will still be in our bundle because |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, we have a thread of a discussion here around architecture that I started and isn't too productive. I think it would be best to move that discussion elsewhere.
This PR fixes a bug, improves readability significantly, adds comprehensive tests, and in case we find our requirements change in the future in such a way that our custom implementation is better, we can always bring back some of the old implementation.
For now, I think it would be best to merge and deploy as it is. 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, let's merge this monster of a PR. It's a big improvement, and as gavrilov says we can still revisit this in the future.
@ikesau The big green Merge button is all yours
Fixes #1330
I rewrote the
formatValue
function to make it more linear and easier to reason about. It now relies much more closely on d3's formatting method:Unit tests have also been added.