Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate Shared\JAMA #3260

Merged
merged 4 commits into from
Dec 30, 2022
Merged

Eliminate Shared\JAMA #3260

merged 4 commits into from
Dec 30, 2022

Conversation

oleibman
Copy link
Collaborator

@oleibman oleibman commented Dec 23, 2022

The code under Shared\JAMA is called from exactly 3 places in Calculation, and exactly 1 other place (see next paragraph). These places are inadequately covered in the test suite, so it is not clear that the existing code works. In addition, see PR #2964 for commentary decrying the continued used of JAMA. I will also note that it has a pitiful 8.20% coverage in the test suite. There seems to already be a perfectly adequate equivalent in Calculation for those parts of JAMA which are used (e.g. we don't use any decompositions, so the fact that no decomposition code is present in Calculation is not a problem). This PR replaces the uses of JAMA within Calculation with checkMatrixOperands, and deletes Shared/JAMA entirely (which, among other thing, makes many Phpstan reported problems go away). The test suite is enhanced to ensure coverage of all the changed statements. A side benefit of deleting Shared/Jama is that the only function which PhpSpreadsheet has added to the global namespace (hypo in Maths.php) goes away.

There is one additional use in Shared/Trend/PolynomialBestFit. That use is best replaced with Matrix/Matrix, which is already a requirement for PhpSpreadsheet. PolynomialBestFit has zero test coverage, and I didn't add any for this change. There seem to be existing errors which both Phpstan and Scrutinizer complain about, and I am quite convinced that they are not false positives. Fixing those problems will be a project for another day.

Calculation has many calls to Information\ErrorValue and Information\ExcelError. It is inconsistent in how it does so - most invoke Information\E... but a small number take advantage of additional use statements to just invoke E.... I have made the usage consistent by changing the small number to act like the majority and eliminating the use statements.

Some of the test cases exposed the fact that MAX, MAXA, MIN, and MINA do not properly handle error strings in their input. For example, if cell A1 contains 2, and A2 contains =5/0, =MAX(A1, A2) should return #DIV/0!. They are changed to handle them properly.

Shared\JAMA was normally omitted from code coverage. Since it is being deleted, there is no longer a reason to explicitly exclude it. Writer\PDF was also on the exclude list, probably for historical reasons, but there is no reason to exclude it now (it is, in fact, 100% covered), so it will no longer be excluded.

This is:

- [x] a bugfix
- [ ] a new feature
- [ ] refactoring
- [x] additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change
  • Documentation is updated as necessary

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

The code under `Shared\JAMA` is called from exactly 3 places in Calculation, and exactly 1 other place (see next paragraph). These places are inadequately covered in the test suite, so it is not clear that the existing code works. In addition, see PR PHPOffice#2964 for commentary decrying the continued used of JAMA. I will also note that it has a pitiful 8.20% coverage in the test suite. There seems to already be a perfectly adequate equivalent in Calculation for those parts of JAMA which are used (e.g. we don't use any decompositions, so the fact that no decomposition code is present in Calculation is not a problem). This PR replaces the uses of JAMA within Calculation with `checkMatrixOperands`, and deletes Shared/JAMA entirely (which, among other thing, makes many Phpstan reported problems go away). The test suite is enhanced to ensure coverage of all the changed statements. A side benefit of deleting Shared/Jama is that the only function which PhpSpreadsheet has added to the global namespace (hypo in Maths.php) goes away.

There is one additional use in Shared/Trend/PolynomialBestFit. That use is best replaced with Matrix/Matrix, which is already a requirement for PhpSpreadsheet. PolynomialBestFit has zero test coverage, and I didn't add any for this change. There seem to be existing errors which both Phpstan and Scrutinizer complain about, and I am quite convinced that they are not false positives. Fixing those problems will be a project for another day.

Calculation has many calls to `Information\ErrorValue` and `Information\ExcelError`. It is inconsistent in how it does so - most invoke `Information\E...` but a small number take advantage of additional `use` statements to just invoke `E...`. I have made the usage consistent by changing the small number to act like the majority and eliminating the `use` statements.

Some of the test cases exposed the fact that MAX, MAXA, MIN, and MINA do not properly handle error strings in their input. For example, if cell A1 contains 2, and A2 contains `=5/0`, `=MAX(A1, A2)` should return `#DIV/0!`. They are changed to handle them properly.

`Shared\JAMA` was normally omitted from code coverage. Since it is being deleted, there is no longer a reason to explicitly exclude it. `Writer\PDF` was also on the exclude list, probably for historical reasons, but there is no reason to exclude it now (it is, in fact, 100% covered), so it will no longer be excluded.
Eliminate some newly dead code.
@oleibman oleibman merged commit 4adafec into PHPOffice:master Dec 30, 2022
@oleibman oleibman deleted the jamadel branch February 10, 2023 05:48
@martinf55
Copy link

The is_numeric check within the unary operator loop over arrays seems too restrictive. Previously it allowed arrays of booleans to be coerced into numerics, which in turn supported the double negation "trick" in SUMPRODUCT formula:

=SUMPRODUCT( --(C34:C57=1), --(B34:B57="SearchString") )

Changing it to

if (is_numeric($result[$row][$column]) || is_bool($result[$row][$column]) ) {

allows the array of booleans, but I'm wondering if it shouldn't be validated using the pattern used in validateBinaryOperand():

        //    Numbers, matrices and booleans can pass straight through, as they're already valid
        if (is_string($operand)) {
            // Try to convert from string to numeric, possibly using convertToNumberIfFormatted()

@oleibman
Copy link
Collaborator Author

@martinf55 Please open this as a new issue, and include a sample spreadsheet and/or code which demonstrates the problem.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request Feb 22, 2023
Fix PHPOffice#3389. It seems some functionality was left behind when JAMA was eliminated (PR PHPOffice#3260). In particular, it is apparently a known trick to use double negation on boolean values as arguments to functions like SUMPRODUCT.
oleibman added a commit that referenced this pull request Feb 24, 2023
* Coerce Bool to Int for Unary Operation on Arrays

Fix #3389. It seems some functionality was left behind when JAMA was eliminated (PR #3260). In particular, it is apparently a known trick to use double negation on boolean values as arguments to functions like SUMPRODUCT.

* Fix 3396

Treat booleans in arrays as int for mathematical operators as well.

* Edge Case

When array operand was neither numeric nor boolean, PhpSpreadsheet had always been evaluating the operand as #NUM!. It will now propagate an error string like #DIV/0!, and treat non-error strings as #VALUE!, consistent with Excel.
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request Feb 22, 2024
Fix PHPOffice#3909. SUMPRODUCT is mishandling multi-row ranges. In Calculation/Calculation, `checkMatrixOperands` will often resize its operands. When it does so, it needs to recalculate the dimensions of each. This fixes the reported problem.

Likely cause was PR PHPOffice#3260. That ticket noted the poor coverage of the code being replaced. Tests of the problem in this ticket were absent and are now added. Despite this, I note that `resizeMatricesShrink` is virtually uncovered, and `resizeMatricesExpand` has substantial gaps in its coverage. I have covered some, but not all, of the Expand gaps. I am struggling to come up with examples to fill its remaining gaps and those for Shrink. However, I will merge this fix in about a week even if I don't succeed.
@oleibman oleibman mentioned this pull request Feb 22, 2024
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants