`#NUM!` values in `amount` and `loc` columns in XLSX are imported as the number 36 by the Activity Browser. #514

mixib · 2021-02-01T16:39:44Z

Consider an XLSX file in the brightway import format, with the value #NUM! for some of the exchanges amount and loc columns.

Following the way in which amounts are calculated based on formulas containing parameters on import, I expected that the amount would be calculated as a mean value based on the uncertainty data. Instead it shows as the number 36 on the Activity Browser. Only when modifying the uncertainty fields exchange by exchange there is a prompt to update the amount to match with the mean.

According to brightway's documentation, the uncertainty type fields are defined in the package stats_arrays. When using a triangular distribution (ID-5), it is expected that when the mode is not provided, the loc value becomes the average between the minimum and the maximum values. Then, I expected #NUM! in the loc field to be replaced by the calculated loc. Instead it shows as the number 36 on the Activity Browser.

Is this normal behaviour? If so, is it documented that the LCA results tab reports on calculations based on the amount and not on the uncertainty fields?

The text was updated successfully, but these errors were encountered:

dgdekoning · 2021-02-02T16:19:02Z

This... is pretty bizarre. Can you forward the information about how distributions make up for missing data?

So this #NUM! field is always translated to 36 during the import? That shouldn't be normal behaviour.
But otherwise it kind of is normal behaviour that the AB (and I think brightway) make use of the amount field by default and only look into the uncertainty fields when performing Monte Carlo analysis or similar.

mixib · 2021-02-04T16:24:26Z

Hi @dgdekoning, I edited my original reply since I looked further into it and found more information.

Static vs stochastic LCA

But otherwise it kind of is normal behaviour that the AB (and I think brightway) make use of the amount field by default and only look into the uncertainty fields when performing Monte Carlo analysis or similar.

It makes sense since the default calculation is static. Indeed, the brightway documentation on the stochastic LCA mentions that the random numbers created according to the uncertainty fields overwrite the amount field.

Sub-issue 1: The bw/AB importer interprets XL error codes as numbers

So this #NUM! field is always translated to 36 during the import?

Yes. In fact, every error value in an XLS/XLSX file translates to a specific number. It is irrelevant if the error value is under amount or under an uncertainty field. Here is some information on the xlrd error codes matching the error values. However, it is very strange that bw/AB read them as numbers when xlrd uses the error tag. Parsing all of them to np.nan would be more appropriate, right?

When the uncertainty fields are used, the bw/AB exporter may fill some fields with #NUM values. The problem comes when the XLSX file is imported back to bw/AB because the #NUM values are parsed as the number corresponding to the xlrd error codes instead of an empty field.

This is how it looks like at the xlrd level:

import xlrd
book = xlrd.open_workbook(myfile.xlsx)
sh = book.sheet_by_index(0)

for rx in range(sh.nrows):
    print(sh.row(rx))

Out:
[text:"Content is '=#NUM'", error:36]
[text:"Content is '=#N/A'", error:42]
[text:"Content is '=#NULL!'", error:0]
[text:"Content is '=1/0'", error:7]
[text:"Content is '=#REF!'", error:23]
[text:"Content is '=#NAME?'", error:29]
[text:"Content is '=#VALUE!'", error:15]
[text:"Content is '=PI()'", number:3.141592653589793]
[text:"Content is '42'", number:42.0]

Sub-issue 2: Outdated documentation of bw's `stats_arrays`

Can you forward the information about how distributions make up for missing data?

It seems that the documentation of the stats_arrays package is not up to date. The table under 'Mapping parameter array columns to uncertainty distributions' provides an overview of the supposedly required and optional parameters. If the optional parameters aren't specified, then these parameters are supposed to be calculated by default. See an excerpt of the table below.

However, testing with an example, if no loc value is specified for the triangular uncertainty or if it is np.nan, then the random number generator yields nan. And if there is no minimum value specified for the uniform or triangular uncertainty, or if it is np.nan, then there is an ImproperBoundsError message associated to these lines:

https://github.com/brightway-lca/stats_arrays/blob/68519a26e6ffe5ed7cefbaadbcdcc048efe75994/stats_arrays/distributions/base.py#L301-L306

This is how it looks like at the stats_arrays level:

from stats_arrays import *
import numpy as np

my_variable = UncertaintyBase.from_dicts(
    {'uncertainty_type': TriangularUncertainty.id,
     'loc': np.nan,#valid input, unexpected output
     'minimum': 15.0, #valid input
     'maximum': 25.0},
    {'uncertainty_type': DiscreteUniform.id,
     'minimum': np.nan, #valid input, expected output
     'maximum': 25.0},
    {'uncertainty_type': UniformUncertainty.id,
     # 'minimum': np.nan, #invalid input
     'minimum': 15, # valid input
     'maximum': 25.0}
)
my_rng = MCRandomNumberGenerator(my_variable)
print(list(zip(my_rng, range(10))))

Out:
[(array([        nan,  7.        , 17.37254671]), 0),
 (array([        nan, 18.        , 22.36840769]), 1),
 (array([        nan, 12.        , 22.60361591]), 2),
 (array([        nan,  7.        , 22.05667132]), 3),
 (array([        nan,  3.        , 16.50150731]), 4),
 (array([        nan, 23.        , 17.09835987]), 5),
 (array([        nan,  1.        , 20.91978931]), 6),
 (array([        nan, 19.        , 18.35423132]), 7),
 (array([        nan, 21.        , 21.61874973]), 8),
 (array([        nan, 22.        , 19.60915629]), 9)]

I think that the triangular and uniform uncertainty are missing the files to deal with the variables marked as optional. For instance, there is a file for the discrete uniform distribution additional to the base.py file. In the additional file, the lack of the optional minimum value is addressed in the next lines:

https://github.com/brightway-lca/stats_arrays/blob/68519a26e6ffe5ed7cefbaadbcdcc048efe75994/stats_arrays/distributions/discrete_uniform.py#L29-L33

Zoophobus · 2023-01-25T12:03:04Z

Hi Mixib,

so I've been going through this and am finding that brightway (version 2) is dealing with these errors slightly differently now. When I've introduced them what occurs is that brightway excludes the 'amount' field, which is personally not an ideal situation, however it is safe and does allow us to throw some warnings/errors.

What I would propose to do myself is to introduce an error message that points to the activity in the database with an exchange that is absent (so should probably contain an error)

marc-vdm · 2023-09-14T15:17:03Z

Closing as resolved in #921

dgdekoning added brightway Issues related to interaction with brightway2 code bug Issues/PRs related to bugs labels Feb 2, 2021

StpdFox self-assigned this Jul 7, 2021

mixib mentioned this issue Jul 9, 2021

Monte Carlo simulation fails due to shape mismatch when considering parameters #615

Open

marc-vdm unassigned StpdFox Sep 22, 2022

Zoophobus mentioned this issue Jan 25, 2023

Alternative sdf importers #921

Merged

3 tasks

Zoophobus added the data imports Issues related to importing data label Feb 9, 2023

marc-vdm closed this as completed Sep 14, 2023

marc-vdm mentioned this issue Jan 13, 2024

Excel database exporter broken #1202

Open

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`#NUM!` values in `amount` and `loc` columns in XLSX are imported as the number 36 by the Activity Browser. #514

`#NUM!` values in `amount` and `loc` columns in XLSX are imported as the number 36 by the Activity Browser. #514

mixib commented Feb 1, 2021

dgdekoning commented Feb 2, 2021

mixib commented Feb 4, 2021 •

edited

Loading

Zoophobus commented Jan 25, 2023

marc-vdm commented Sep 14, 2023

#NUM! values in amount and loc columns in XLSX are imported as the number 36 by the Activity Browser. #514

#NUM! values in amount and loc columns in XLSX are imported as the number 36 by the Activity Browser. #514

Comments

mixib commented Feb 1, 2021

dgdekoning commented Feb 2, 2021

mixib commented Feb 4, 2021 • edited Loading

Static vs stochastic LCA

Sub-issue 1: The bw/AB importer interprets XL error codes as numbers

Sub-issue 2: Outdated documentation of bw's stats_arrays

Zoophobus commented Jan 25, 2023

marc-vdm commented Sep 14, 2023

`#NUM!` values in `amount` and `loc` columns in XLSX are imported as the number 36 by the Activity Browser. #514

`#NUM!` values in `amount` and `loc` columns in XLSX are imported as the number 36 by the Activity Browser. #514

mixib commented Feb 4, 2021 •

edited

Loading

Sub-issue 2: Outdated documentation of bw's `stats_arrays`