Issue with array dimension error in regression models #1297
Comments
Hi @PanyiDong, this seems interesting, and at a glance I'm not sure why it hasn't been an issue before; it would make sense that the estimator predicts a 1d output. For reference This is further confirmed by checking the source code of Your solution should work for single-output regression, but I'll need to test properly to make a solution that also works for multi-output regression. I'll also have to check why the tests have not caught this before. Many thanks,
Hi @PanyiDong, sorry for the slow response to this. Turns out that indeed it was the
Fixed with #1335
Describe the bug
I'm calling some of the regression methods provided in auto-sklearn for my project, and an error occurs when using mlp, libsvm_svr, or sgd. The exact error message is (the returned 1D array is omitted):
for autosklearn/pipeline/components/regression/mlp.py, autosklearn/pipeline/components/regression/libsvm_svr.py and autosklearn/pipeline/components/regression/sgd.py
To Reproduce
Test data: https://www.kaggle.com/tejashvi14/medical-insurance-premium-prediction/download
Using "PremiumPrice" as response/y and other variables as features/X
Fit stage (the time limit is only there to save time; I don't expect it to return anything meaningful):
Predict stage:
The training stage emits an enormous number of warnings like
[WARNING] [2021-11-09 15:14:31,628:Client-AutoMLSMBO(1)::079213e7-41a2-11ec-97c8-00155d1712a6] Configuration 119 not found
(with different numbers in place of 119). For AutoSklearnRegressor, predict just returns a (n_sample, ) numpy array whose elements are all identical (close to the mean of the response, but not exactly equal to it), which I don't think is the intended behavior.
Output of the test predict stage (only the first few lines are shown; the rest are the same):
Reason for the Problem
I think the problem is caused by the standardization (sklearn.preprocessing.StandardScaler) used in autosklearn/pipeline/components/regression/mlp.py, autosklearn/pipeline/components/regression/libsvm_svr.py and autosklearn/pipeline/components/regression/sgd.py
The code below is extracted from autosklearn/pipeline/components/regression/sgd.py, iterative_fit, lines 92-95:
And in the predict method, lines 131-132:
Y_pred, returned by the predict method, is a (n_sample, ) numpy array, while the inverse_transform of StandardScaler requires a (n_sample, 1) array. The correction should be something like:
I think mlp/libsvm_svr have the same problem.
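For illustration, here is a minimal standalone sketch of the reshape described above. It is not the auto-sklearn code or the fix that was merged. The target values and the variable names (y, scaler, y_pred_scaled) are made up for the example, and a copy of the scaled targets stands in for the estimator's 1D predict output.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical targets, standing in for the "PremiumPrice" column.
y = np.array([15000.0, 23000.0, 28000.0, 35000.0])

# Fit the scaler on a column vector of shape (n_samples, 1),
# as a regressor wrapper would do while fitting.
scaler = StandardScaler()
y_scaled = scaler.fit_transform(y.reshape(-1, 1)).ravel()

# Stand-in for the estimator's predict output: a 1D array of shape (n_samples,).
y_pred_scaled = y_scaled.copy()

# Reshape to (n_samples, 1) before inverse_transform, then flatten back to 1D.
y_pred = scaler.inverse_transform(y_pred_scaled.reshape(-1, 1)).ravel()

print(y_pred.shape)
```

This only covers the single-output case; a multi-output regressor already predicts a 2D array and would need to skip the reshape and the final flatten.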
Environment and installation: