Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22781 Support Arbitrary Constant Unit Formatting #3381

Merged
merged 1 commit into from
Feb 12, 2025

Conversation

younies
Copy link
Member

@younies younies commented Feb 10, 2025

Description:

  • Added support for constant denominators in MeasureUnit and LongNameHandler
  • Updated MeasureUnit serialization and product methods to handle constant denominators
  • Added test cases for formatting units with arbitrary constant denominators
  • Added comprehensive test coverage for complex unit formatting scenarios

Related ticket that needs to be addressed: https://unicode-org.atlassian.net/browse/ICU-23039

Checklist

  • Required: Issue filed: ICU-22781
  • Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

younies added a commit to younies/icu that referenced this pull request Feb 10, 2025
@younies younies force-pushed the cpp-format-arbitrary-constants branch from 6b1855c to 29da2c6 Compare February 10, 2025 14:49
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

younies added a commit to younies/icu that referenced this pull request Feb 10, 2025
@younies younies force-pushed the cpp-format-arbitrary-constants branch from 29da2c6 to 5ad9adb Compare February 10, 2025 14:49
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@younies younies requested review from sffc and richgillam February 10, 2025 14:50
younies added a commit to younies/icu that referenced this pull request Feb 10, 2025
@younies younies force-pushed the cpp-format-arbitrary-constants branch from 6b8ba6f to d132924 Compare February 10, 2025 15:21
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a few questions...

if (impl.singleUnits.length() > 1) {

if (this->getConstantDenominator(status) != 0 && other.getConstantDenominator(status) != 0) {
// Cannot multiply units that both of them have a constant denominator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only one constant denominator in a compound unit.
Therefore, we Cannot multiply units that both of them have a constant denominator

I have updated the comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but if you multiply together two units that both have constant denominators, wouldn't the constant denominator of the result just be the product of the original two units' constant denominators? That is, if you multiply liter-per-100-kilometer by liter-per-100-kilometer, wouldn't the result be square-liter-per-10000-square-kilometer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, but I am looking at it from another perspective.

liter-per-100-kilometer multiplied by liter-per-100-kilometer will result in square-liter-per-100-100-square-kilometer, which leads to duplication in unit constants.

but we can also, do it the way you have suggested, but because I need to check that the multiplication should not exceed LONGMAX, let's add a TODO and try to fix it later. (we can quickly discuss it too in the ICU meeting)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be interesting to hear what Mark thinks. But I'm fine with what you're suggesting here.

}

impl.constantDenominator =
this->getConstantDenominator(status) + other.getConstantDenominator(status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add here because you know one or the other of these is going to be 0? If that's what you're doing, I'd recommend documenting it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

u"3 meters per 100-second-kelvin"},
{"meter-per-1000", 1, UNUM_UNIT_WIDTH_FULL_NAME, Locale::getEnglish(), u"1 meter per 1000"},
{"meter-per-1000-second", 1, UNUM_UNIT_WIDTH_FULL_NAME, Locale::getEnglish(),
u"1 meter per 1000-second"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't you have a bunch of logic for handling plurals in the code above? Shouldn't this be "1 meter per 1000 seconds" or something like that? And why is there a hyphen in "1000-second"? We wouldn't have that in English.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, for now, it is just a plain number.

There is an issue for that: https://unicode-org.atlassian.net/browse/ICU-23039

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks really weird, but since you have an outstanding ticket to improve this, I'll accept it for now.

u"1 meter per 1-kilogram-second-kelvin"},
{"meter-second-per-kilogram-kelvin", 1, UNUM_UNIT_WIDTH_FULL_NAME, Locale::getEnglish(),
u"1 meter-second per kilogram-kelvin"},
{"meter-second-per-1000-kilogram-kelvin", 1, UNUM_UNIT_WIDTH_FULL_NAME, Locale::getEnglish(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue here (among other places): Shouldn't this be "meter-second per 1000 kilogram-kelvins"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, for now, it is just a plain number.

There is an issue for that: https://unicode-org.atlassian.net/browse/ICU-23039

u"1 m⋅sec/1000⋅kg⋅K"},
{"meter-second-per-1000-kilogram-kelvin", 1, UNUM_UNIT_WIDTH_FULL_NAME, Locale::getGerman(),
u"1 Meter⋅Sekunde pro 1000⋅Kilogramm⋅Kelvin"},
{"meter-second-per-1000-kilogram-kelvin", 1, UNUM_UNIT_WIDTH_SHORT, Locale::getGerman(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark had been arguing for special formatting for larger denominators, so that if you had "meter-per-1e6-kilogram", you'd get "meter per 1,000,000 kilograms". Did you do that? If so, you should have a test for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, for now, it is just a plain number.

There is an issue for that: https://unicode-org.atlassian.net/browse/ICU-23039

UnicodeString result =
formatter.locale("en-US").formatDouble(value, status).getOutputUnit(status).getIdentifier();
assertEquals("Testing default -u-mu- for en-US", MeasureUnit::getFahrenheit().getIdentifier(),
result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of a number of cases where the code seems to have been reformatted with no substantive change. Those kinds of things make the diff a lot harder to read. Is the reformatting necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you it is really annoying when the IDE format some part that are not necessary.

Sorry that I did not revert them, I am reverting them now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@younies younies requested a review from richgillam February 11, 2025 15:01
@younies younies changed the title ICU-22781 Support Arbitrary Constant Unit Formatting (C++) ICU-22781 Support Arbitrary Constant Unit Formatting Feb 11, 2025
younies added a commit to younies/icu that referenced this pull request Feb 11, 2025
@younies younies force-pushed the cpp-format-arbitrary-constants branch from 9f703cf to 4e1af73 Compare February 11, 2025 18:51
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

richgillam
richgillam previously approved these changes Feb 11, 2025
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think. I still think you can do the "right thing" in product(), but I don't wasn't to hold this up over that.

// "meter per 100 seconds" should be evaluated for correct singular/plural
// usage: "second" or "seconds"?
// Similarly, "kilogram per 1000 meters" should be checked for "meter" or
// "meters"?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the TODO here. And if it was there all along, I'm sorry I didn't notice it and put two and two together before I added my previous comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, it was there from the first version , but no problem :)

@younies younies force-pushed the cpp-format-arbitrary-constants branch from e323dc8 to 6954aac Compare February 11, 2025 22:54
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@younies younies requested a review from richgillam February 11, 2025 22:54
@younies younies merged commit 35c9778 into unicode-org:main Feb 12, 2025
101 checks passed
@younies younies deleted the cpp-format-arbitrary-constants branch February 12, 2025 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants