-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for custom Calculation functions #2900
Comments
PS. Use case: I would like to be able to define a function which accesses external data. |
Useful for you perhaps, but it's a lot more significant as a feature to make it useful for others. Why do you need this to be something that can be defined as a function for the calculation engine? The purpose of teh Calculation Engine is for evaluating spreadsheet formulae, so why do you need to add new functions that don't exist in Excel itself? What functions do you have in spreadsheet formulae that don't exist in MS Excel already? I can see value with functions between different spreadsheet applications, e.g. ODS supports a But I can also see problems when saving that in a format that doesn't support those functions... if a formula uses a custom function, then it will fail when that file is opened in MS Excel because the function isn't available in Excel (or unless we fully expand to support any UDF, which means supporting VBA as well). |
Also related to pull request #2734. |
I was just about to point you to the "flyweight" experiments. 🙂 Those are very much experimental; and while their purpose was to reduce memory usage in the calculation engine, they would make it a lot easier to extend the formula function set with custom functions. There's still problems with those experiments, they're not specially good implementations, the expected memory savings aren't visible with the test suite (although I believe they would be there with real world applications) and the 500 files in one folder isn't healthy for PHPStorm. But I haven't dismissed the concept (which is why I haven't deleted the branches); and the principles (particularly for experiment #2) could be extended to allow registering additional functions more easily. The problem of load an Ods, register |
A great question - thank you. I'm writing a custom reporting system that pulls in data from a range of different systems/sources. However, how the data will be queried and presented needs' to be within the users' control. With a few small modifications, phpSpreadsheet makes an excellent tool in that regard(!) as the calculation engine offers a great deal of power and flexibility. I've made the mods above to a local copy and, as you say, they work for me. In addition, I have overridden Spreadsheet::getSheetByName() and used it to "lazy load" sheets dynamically from sheets from both stored in a database and dynamically generated data from elastic search, mysql, postgresql, etc. I also hope to dynamically query data from other sources such as Google Sheets. The ability to define custom functions allows users to pull in calculated values rather than whole data sets. As proof of concept, I created an AGGREGATE function which takes the name of a data source, field and aggregation method. When "lazy loading" sheets, I tried to define local named ranges for the imported data (ie. column names) but they don't appear to work as expected ie. if I reference the sheet ImportedWorksheet!COLUMN_NAME it won't resolve. I'm not sure if it's me or a bug. As a workaround - and proof of concept only - I created a DCOL function which takes the name of the worksheet and column and returns the appropriate cell range. Finally, when lazy loading, it would be useful to support the external worksheet syntax eg. [External]Worksheet!A1:C3 as this would allow greater control over where the data is dynamically pulled in from. From you side, this would mean small changes and no change in functionality - just so long as there's a method to override to return the imported spreadsheet. |
Agreed. That would be the responsibility of the application, not phpSpreadsheet, as - out of the box - it wouldn't ship with any "non-excel" functions (other than perhaps ODS regex, as you mentioned above). In my use case, I'm leveraging phpSpreadsheet primarily as a calculation engine and to assist with presentation, rather than for loading/saving sheets in different formats... in fact, I'm storing and retrieving only the formulas and number formats from a database of user defined sheets. |
Thank you.
I took a quick look at the pull request and couldn't figure out how it would save memory or time - for me, the big function list is a "flyweight" already. I'm glad to see my instincts weren't wrong there! ;-) The main benefit I see in the experimental implementation was in making the function definitions more self contained ie. not having to refer back to the big function list when reviewing the list. Whether it's worth the additional work and I/O of one class per function is up for debate, I guess. |
PS. While working on this, the thought occurred to me that refactoring the calculation engine (using Interfaces) would allow it to be used independently. I grep'ed through the Calculation folder and only saw a few references to some utility classes and a few phpSpreadsheet specific classes... doesn't look like it would be a huge task (any problematic functions could be omitted from the abstracted calculation engine and defined as phpSpreadsheet specific implementations). ./BinaryComparison.php:use PhpOffice\PhpSpreadsheet\Shared\StringHelper; |
I guess that's a valid use case, especially as I actually provide a quadratic equation evaluator in the
And I do have some concerns about that, and the number of developers that don't take that responsibility, because it does create extra work for me when developers raise issues for something that they should be responsible for I'll certainly consider it, though there's two major updates to the calculation engine that I'm working on that I want to handle first
In theory at least, it eliminates the large And the individual function signatures are maintained as structured objects implementing |
I have looked at that possibility a few times, likewise with separating out Readers and Writers... about 35% of user devs only use Readers, closer to 50% use only Writers, with the balance using both; but nearly 85% use the Calculation Engine (mostly indirectly, without even realising it) so extracting the Calc Engine itself doesn't seem particularly worthwhile. Back in the PhpExcel days, it was much more tightly coupled to the core spreadsheet implementation (and had a cyclomatic complexity of over 21 million)... I've spent a lot of time over the years working on it to reduce that coupling and reduce its complexity, but I don't think it can ever be truly standalone. |
Thank you.
Great to hear that this feature is going to make the cut - as it will help with dynamically imported sheets.
Certainly has! And is an invaluable project for the php community. Thank you. |
Understood. I guess I didn't mean that it would be come a separate project. Rather that it could be refactored so that it can be used independently (and extended / adapted as required). An example of this is Laravel, where the framework is the "super" package, which provides the sub packages in it's composer.json and the sub packages (in the src folder) also contains thier own composer.json files. The whole project is managed with git submodules or subtrees (I can't recall which). Most users pull in the main package - but - sub packages are also valuable to those that don't use the full framework (hence broadening the community / user base). Anyway, not a high priority for me personally - I could just imagine (as I am sure you have) the engine being used in a wide variety of applications. |
PS. The only change I "need" to be able to use the stock package is this one: private function addDefaultArgumentValues(array $functionCall, array $args, array $emptyArguments): array
{
if (is_object($functionCall[0])) {
$reflector = new ReflectionMethod($functionCall[0], $functionCall[1]);
} else {
$reflector = new ReflectionMethod(implode('::', $functionCall));
}
\\ ... As I use this evil hack to override phpSpreadsheetFunctions in my own Spreadsheet class: public function registerFunction(string $name, array $definition)
{
$engine = $this->getCalculationEngine();
$name = strtoupper($name);
$functions = $engine->getFunctions();
$functions[$name] = $definition;
$refObject = new ReflectionObject($engine);
$target = $refObject->getParentClass();
$refProperty = $target->getProperty('phpSpreadsheetFunctions');
$refProperty->setAccessible(true);
$refProperty->setValue($engine, $functions);
return $this;
} |
+1 on this request. |
@robertotremonti that's our use case also - as mentioned, for the time being, the hack above works for us. On the output side, we currently use only number formatting and a custom html writer. |
Thanks for the suggestion @looksystems. |
This is:
What is the expected behavior?
Add ability to add own calculation functions / implementations.
What is the current behavior?
Currently, Calculation class (and it's usage within Spreadsheet) is private and cannot be overridden or extended without changes to the source code.
In particular, Calculation::$phpSpreadsheetFunctions is private and there are no methods available to add/override function.
Proposal: add the following method to the Calculation class...
ALSO
Calculation::addDefaultArgumentValues only supports static methods (for Calculation functions):
** Proposal:** modify addDefaultArgumentValues...
What are the steps to reproduce?
N/A
What features do you think are causing the issue
Does an issue affect all spreadsheet file formats? If not, which formats are affected?
Yes
Which versions of PhpSpreadsheet and PHP are affected?
Latest PhpSpreadsheet and all supported versions of php.
The text was updated successfully, but these errors were encountered: