Bypass shorter predictions by vw #7
base: master
That looks good to me, thanks for catching that! I may not use the The package does not even have a |
I don't think this is worth a separate demo file.
Loading and preparing data from vw_example.R
The code below will represent: For rvw this will give us: Using perf
Throws warning message "Predicted values file is longer." generated py Before PR using
|
Ok, it is a little hard to see what code you'd change in the example. Also, when I currently run |
Without PR I get this error message:
When executing the grid search loop:
To bypass it I changed this code in Probs and labels vectors will be forced to a numeric type.
Before was:
Then the longer vector will be trimmed to the size of the shorter one.
Before was:
The resulting vectors will have equal length and no NA values, and will be used to compute the ROC curve. |
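The bypass described above can be sketched in base R as follows. This is a minimal illustration, not the code from the PR: the helper name align_predictions and the sample vectors are hypothetical, and the pROC call is shown only as a comment.

```r
# Hypothetical helper illustrating the bypass; not the actual rvw code.
align_predictions <- function(probs, labels) {
  # Force both vectors to a numeric type.
  probs  <- as.numeric(probs)
  labels <- as.numeric(labels)

  # Trim the longer vector to the length of the shorter one.
  n <- min(length(probs), length(labels))
  probs  <- probs[seq_len(n)]
  labels <- labels[seq_len(n)]

  # Drop positions where either value is NA.
  keep <- !is.na(probs) & !is.na(labels)
  list(probs = probs[keep], labels = labels[keep])
}

# Made-up data: the predictions vector is one element short
# and contains one missing value.
aligned <- align_predictions(probs  = c("0.9", "0.2", NA, "0.7"),
                             labels = c(1, -1, 1, 1, -1))
str(aligned)

# The aligned vectors could then be passed to pROC, e.g.
# pROC::roc(response = aligned$labels, predictor = aligned$probs)
```

Note that trimming silently assumes the missing predictions are at the end of the file; if they were dropped elsewhere, labels and predictions would no longer line up.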
I would add the code from my first comment to show how behavior changes with or without |
R> results <- cbind(iter=1:nrow(grid), grid, auc=do.call(rbind, aucs))
R> print(results)
   iter    l1    l2  eta   extra      auc
1     1 1e-07 1e-07 0.10 --nn 10 0.996496
2     2 1e-08 1e-07 0.10 --nn 10 0.996496
3     3 1e-07 1e-08 0.10 --nn 10 0.996496
4     4 1e-08 1e-08 0.10 --nn 10 0.996496
5     5 1e-07 1e-07 0.05 --nn 10 0.995664
6     6 1e-08 1e-07 0.05 --nn 10 0.995664
7     7 1e-07 1e-08 0.05 --nn 10 0.995664
8     8 1e-08 1e-08 0.05 --nn 10 0.995664
9     9 1e-07 1e-07 0.10         0.987865
10   10 1e-08 1e-07 0.10         0.991949
11   11 1e-07 1e-08 0.10         0.987865
12   12 1e-08 1e-08 0.10         0.991949
13   13 1e-07 1e-07 0.05         0.988334
14   14 1e-08 1e-07 0.05         0.991517
15   15 1e-07 1e-08 0.05         0.988334
16   16 1e-08 1e-08 0.05         0.991517
R> |
Also:
So I am at a loss. Could you please prepare a self-contained script that exhibits the bug? |
If I understand correctly, I get this error because of
I will try to see if I get the same bug on other platforms. This R script fails for me right now:
with this error message:
My system:
|
I am at Let me see if I can quickly build 8.5.0. |
Unchanged with [...]
finished run
number of examples per pass = 43940
passes used = 1
weighted example sum = 43940.000000
weighted label sum = 6562.000000
average loss = 29.772555
best constant = 0.149340
best constant's loss = 0.977698
total feature number = 439376
Call:
roc.default(response = labels, predictor = probs, auc = TRUE, print.auc = TRUE, print.thres = TRUE)
Data: probs in 18689 controls (labels -1) < 25251 cases (labels 1).
Area under the curve: 0.996
Model Parameters
/usr/bin/vw -d X_train.vw --loss_function logistic -f mdl.vw --learning_rate=0.5 --passes 1 -c -b 25 --nn 10
AUC: 0.996387
edd@rob:~/git/rvw/demo(master)$
That is, I just don't see the error you are seeing. |
Sorry for the long wait. I've been testing different versions of Vowpal Wabbit with I used these arguments for training and predicting:
Similar arguments are used in For versions: 7.10, 8.0, 8.1.1, 8.2.1, 8.3.1
Unfortunately I didn't manage to build version 8.4.0. The latest version, 8.5.0, gives me the following incorrect result:
I suppose this only happens on macOS systems, but I will test this behavior on other operating systems. |
Thanks for your patience with that; rebuilding those versions is work! I had just jumped to the (Debian package sources) of 8.5.0 from here and I then built a local package. It looks like that was a full 8.5.0 release so the difference to yours may indeed be macOS vs Linux. Strange. Did you peek into the vw mailing list? And/or would you have a chance to test on another OS? |
I have not yet contacted people from the vw mailing list, but I am planning to do it today. |
I was even thinking just about lurking on the list / looking where the issue had come up. A cross-check on Windows or Linux should be useful. |
I think I found a bug in the external vw code. Sometimes a shorter predictions file is generated when a neural network model is used (specified by the --nn argument) on macOS 10.13.1 and probably other platforms. This causes an error in the roc function from the pROC package and can be seen in the vw_example.R demo file. While there is no fix for this behavior right now, it may be useful to add a workaround.
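One way to notice the problem before calling roc is to compare the number of lines in the predictions file with the number of test examples. The sketch below uses only base R; the function name check_prediction_count and the temporary file are assumptions made for illustration and are not part of rvw or vw.

```r
# Hypothetical check; the helper and the file contents are illustrative only.
check_prediction_count <- function(pred_file, n_examples) {
  n_pred <- length(readLines(pred_file))
  if (n_pred != n_examples) {
    warning(sprintf("Predictions file has %d lines, expected %d.",
                    n_pred, n_examples))
  }
  n_pred == n_examples
}

# Simulate a predictions file that is one line short.
pred_file <- tempfile(fileext = ".txt")
writeLines(c("0.91", "0.12", "0.78"), pred_file)
ok <- suppressWarnings(check_prediction_count(pred_file, n_examples = 4))
print(ok)
```

A check like this only reports the mismatch; it does not say which examples lost their predictions.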