[BUG]: 'y_true' and 'y_pred' with just 1 value #7

Open
wasf84 opened this issue Jul 5, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@wasf84

wasf84 commented Jul 5, 2024

Description of the bug

Hi.
First of all, thank you for this work. It's helping me a lot with my personal project.

I've noticed that when 'y_true' and 'y_pred' each contain just a single value, the code crashes.

I'm working on rainfall-runoff modeling to forecast a few days ahead. When I try to forecast just 1 day ahead, it crashes during the evaluation step.

Environment:
Windows 11
Python 3.9.7
Permetrics 2.0.0

Thanks again for your attention.

Steps To Reproduce

import numpy as np
from permetrics import RegressionMetric

y_true = np.array([3])
y_pred = np.array([2.5])

evaluator = RegressionMetric()

rmse_1 = evaluator.RMSE(y_true, y_pred)
rmse_2 = evaluator.root_mean_squared_error(y_true, y_pred)
print(f"RMSE: {rmse_1}, {rmse_2}")

mse = evaluator.MSE(y_true, y_pred)
mae = evaluator.MAE(y_true, y_pred)
print(f"MSE: {mse}, MAE: {mae}")

Additional Information

The output cell

{
"name": "IndexError",
"message": "tuple index out of range",
"stack": "---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[237], line 10
7 evaluator = RegressionMetric()
9 ## 3.1 Call specific function inside object, each function has 2 names like below
---> 10 rmse_1 = evaluator.RMSE(y_true, y_pred)
11 rmse_2 = evaluator.root_mean_squared_error(y_true, y_pred)
12 print(f"RMSE: {rmse_1}, {rmse_2}")

File lib\site-packages\permetrics\regression.py:237, in RegressionMetric.root_mean_squared_error(self, y_true, y_pred, multi_output, force_finite, finite_value, **kwargs)
222 def root_mean_squared_error(self, y_true=None, y_pred=None, multi_output="raw_values", force_finite=True, finite_value=1.0, **kwargs):
223 """
224 Root Mean Squared Error (RMSE): Best possible score is 0.0, smaller value is better. Range = [0, +inf)
225
(...)
235 result (float, int, np.ndarray): RMSE metric for single column or multiple columns
236 """
--> 237 y_true, y_pred, n_out = self.get_processed_data(y_true, y_pred)
238 result = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
239 return self.get_output_result(result, n_out, multi_output, force_finite, finite_value=finite_value)

File lib\site-packages\permetrics\regression.py:119, in RegressionMetric.get_processed_data(self, y_true, y_pred, **kwargs)
108 """
109 Args:
110 y_true (tuple, list, np.ndarray): The ground truth values
(...)
116 n_out: Number of outputs
117 """
118 if (y_true is not None) and (y_pred is not None):
--> 119 y_true, y_pred, n_out = du.format_regression_data_type(y_true, y_pred)
120 else:
121 if (self.y_true is not None) and (self.y_pred is not None):

File lib\site-packages\permetrics\utils\data_util.py:22, in format_regression_data_type(y_true, y_pred)
20 if y_true.ndim > 2:
21 raise ValueError("y_true and y_pred must be 1D or 2D arrays.")
---> 22 return y_true, y_pred, y_true.shape[1] # n_outputs
23 else:
24 raise ValueError("y_true and y_pred must have the same number of dimensions.")

IndexError: tuple index out of range"
}
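
A note on the traceback above: the failing expression is `y_true.shape[1]`, which raises this exact error whenever the array has fewer than two dimensions. One way a single-element input could end up there is if it is reduced to a 0-dimensional array first, for example by `np.squeeze`. Whether `format_regression_data_type` actually squeezes its inputs is an assumption here, since the traceback does not show those lines. The sketch below uses only NumPy to show the mechanism.

```python
import numpy as np

# Same single-element ground truth as in the report above.
y_true = np.array([3]).astype(float)

# ASSUMPTION: the library squeezes its inputs before inspecting ndim.
# The traceback does not show this step; it is illustrated here only
# to explain how shape[1] can fail on a one-element array.
squeezed = np.squeeze(y_true)

print(y_true.shape, y_true.ndim)      # 1-D array with one element
print(squeezed.shape, squeezed.ndim)  # 0-D array, empty shape tuple

error_message = None
try:
    # Indexing position 1 of an empty shape tuple fails.
    n_outputs = squeezed.shape[1]
except IndexError as exc:
    error_message = str(exc)
    print(f"IndexError: {error_message}")

# np.atleast_1d turns the 0-D array back into a 1-D array of length 1.
restored = np.atleast_1d(squeezed)
print(restored.shape)
```

If that is indeed the cause, a guard such as `np.atleast_1d` after squeezing would be one possible way to keep one-element inputs on the 1-D code path.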

@wasf84 wasf84 added the bug Something isn't working label Jul 5, 2024
@wasf84
Author

wasf84 commented Jul 6, 2024

I'm sorry, I forgot to mention that this error occurs with any of the metrics, not just the ones I listed here.

@thieu1995
Owner

@wasf84,
Yes, of course it will crash for any metric. What is the point of using just 1 value to calculate a metric? Several metrics need to calculate the mean. With 1 value, you can't calculate the mean.
I think you should collect enough data before calculating metrics. If you are forecasting a few days ahead, spend a few more days gathering data first, then calculate the metrics afterwards.

@wasf84
Author

wasf84 commented Jul 24, 2024

Hi @thieu1995

I was trying to implement Walk-Forward Validation based on this paper:
https://www.sciencedirect.com/science/article/pii/S259012302400358X?via%3Dihub

I'm forecasting 1 day ahead, calculating the metrics, merging the results into a DataFrame, and so on, through the whole of 2023. But I'll try what you suggested: collecting more data and calculating the metrics at the end of the experiment.

Thank you.
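
For readers following the same approach, here is a minimal sketch of collecting one-step-ahead forecasts during walk-forward validation and computing the metrics once at the end. The names `persistence_forecast`, `walk_forward`, `mae`, and `rmse` are hypothetical helpers written for this illustration, not part of permetrics, and the persistence model is only a placeholder for a real rainfall-runoff model.

```python
import math


def persistence_forecast(history):
    """Hypothetical placeholder model: predict the last observed value."""
    return history[-1]


def walk_forward(series, n_train, forecast_fn):
    """Forecast one step ahead at each time step, using only past values."""
    truths, preds = [], []
    for t in range(n_train, len(series)):
        preds.append(forecast_fn(series[:t]))  # model sees data before t only
        truths.append(series[t])               # value observed at step t
    return truths, preds


def mae(truths, preds):
    """Mean absolute error over all collected pairs."""
    return sum(abs(a - b) for a, b in zip(truths, preds)) / len(truths)


def rmse(truths, preds):
    """Root mean squared error over all collected pairs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(truths, preds)) / len(truths))


# Made-up example series; the first two values act as the initial history.
series = [3.0, 2.5, 4.0, 3.5, 5.0]
truths, preds = walk_forward(series, n_train=2, forecast_fn=persistence_forecast)

overall_mae = mae(truths, preds)
overall_rmse = rmse(truths, preds)
print(f"pairs: {len(truths)}, MAE: {overall_mae:.4f}, RMSE: {overall_rmse:.4f}")
```

The collected `truths` and `preds` lists contain more than one value by the end of the loop, so they could also be converted to arrays and passed to the library's metrics in a single call.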
