Refactor ROC module #183

Open
Tracked by #438
ljchang opened this issue Nov 27, 2017 · 3 comments

Comments

ljchang commented Nov 27, 2017

The ROC plot has had a lot of problems. Right now, forced choice accuracy does not always seem to be correct.

We should refactor this and write proper tests.

We also need to address the balanced accuracy p-value at some point (try a permutation test).
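As a rough illustration of the permutation idea, here is a minimal sketch that shuffles the outcome labels to build a null distribution of balanced accuracy. The function names `balanced_accuracy` and `permutation_p_value` are hypothetical and not part of nltools; the sketch assumes the predictions have already been thresholded to booleans and that both classes are present in the outcomes.

```python
import numpy as np


def balanced_accuracy(predicted, actual):
    """Mean of sensitivity and specificity for boolean arrays."""
    predicted = np.asarray(predicted, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    sensitivity = np.mean(predicted[actual])     # true positive rate
    specificity = np.mean(~predicted[~actual])   # true negative rate
    return (sensitivity + specificity) / 2


def permutation_p_value(predicted, actual, n_permutations=5000, seed=0):
    """One-sided permutation p-value for balanced accuracy (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    actual = np.asarray(actual, dtype=bool)
    observed = balanced_accuracy(predicted, actual)
    # Shuffling the labels breaks any link between predictions and outcomes
    null = np.array([
        balanced_accuracy(predicted, rng.permutation(actual))
        for _ in range(n_permutations)
    ])
    # Add-one correction so the p-value is never exactly zero
    p_value = (np.sum(null >= observed) + 1) / (n_permutations + 1)
    return observed, p_value
```

Because the labels are only shuffled, each permutation keeps the original class balance, so the null distribution reflects chance performance for this particular design.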

ljchang commented Nov 27, 2017

The forced choice test might be impacted by commit 3bb8db3 by @ljchang.
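For writing tests, it may help to have a small reference implementation to compare against. The sketch below is an assumption about what forced choice accuracy should mean here, not the module's actual code: each subject contributes exactly one positive and one negative observation, and the subject counts as correct when the positive observation has the strictly higher input value (ties count as incorrect). The name `forced_choice_accuracy` is hypothetical.

```python
import numpy as np


def forced_choice_accuracy(input_values, binary_outcome, subject_id):
    """Fraction of subjects whose positive item scores higher than the negative one."""
    input_values = np.asarray(input_values, dtype=float)
    binary_outcome = np.asarray(binary_outcome).astype(bool)
    subject_id = np.asarray(subject_id)

    correct = []
    for sub in np.unique(subject_id):
        mask = subject_id == sub
        positive = input_values[mask & binary_outcome]
        negative = input_values[mask & ~binary_outcome]
        # Assumption: exactly one observation per class per subject
        if len(positive) != 1 or len(negative) != 1:
            raise ValueError(f"Subject {sub} needs one positive and one negative value")
        correct.append(positive[0] > negative[0])
    return float(np.mean(correct))
```

A test could then check that the refactored module agrees with this function on small hand-checked inputs, regardless of whether the inputs are ints, floats, or booleans.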

ejolly changed the title from "Refactor ROC Plot" to "Refactor ROC module" on Mar 16, 2021
mpcoll commented Apr 1, 2021

In case this is helpful, I noticed that the type of the inputs silently changes the results for the same data (see the example below). To avoid this, I think the input variables should be explicitly coerced to a specific type, or an error should be raised when they are not of the expected type.

I get different results for each of these examples:

from nltools.analysis import Roc
import numpy as np
import pandas as pd

inputs = np.array([1, 2, 1, 2, 2, 1, 1, 2])
outcomes = np.array([0, 1, 0, 1, 0, 1, 0, 1])
subs = np.array([1, 1, 2, 2, 3, 3, 4, 4])

# With int outcomes
roc = Roc(inputs, outcomes)
roc.calculate()
roc.summary()

# With numpy boolean outcomes
outcomes = outcomes.astype(bool)
roc = Roc(inputs, outcomes)
roc.calculate()
roc.summary()

# Forced choice
# With int inputs
roc = Roc(input_values=inputs,
          binary_outcome=outcomes,
          forced_choice=subs)
roc.calculate()
roc.summary()

# With float inputs
roc = Roc(input_values=inputs.astype(float),
          binary_outcome=outcomes,
          forced_choice=subs)
roc.calculate()
roc.summary()

# With pd Series outcomes
roc = Roc(input_values=inputs.astype(float),
          binary_outcome=pd.Series(outcomes.astype(bool)),
          forced_choice=subs)
roc.calculate()
roc.summary()
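The kind of coercion I have in mind could look something like the sketch below. The helper `coerce_roc_inputs` is hypothetical (it does not exist in nltools), and the choice of float inputs and boolean outcomes is an assumption about what the module should expect. Lists, numpy arrays, and pandas Series all pass through `np.asarray`, so they end up in the same form.

```python
import numpy as np


def coerce_roc_inputs(input_values, binary_outcome, forced_choice=None):
    """Coerce inputs to float and outcomes to bool, or raise (hypothetical helper)."""
    values = np.asarray(input_values, dtype=float).ravel()
    outcome = np.asarray(binary_outcome).ravel()

    if outcome.dtype != bool:
        # Only accept values that map unambiguously onto False/True
        if not np.isin(np.unique(outcome), [0, 1]).all():
            raise ValueError("binary_outcome must contain only 0/1 or booleans")
        outcome = outcome.astype(bool)

    if values.shape != outcome.shape:
        raise ValueError("input_values and binary_outcome must have the same length")

    if forced_choice is not None:
        forced_choice = np.asarray(forced_choice).ravel()
        if forced_choice.shape != values.shape:
            raise ValueError("forced_choice must have the same length as input_values")

    return values, outcome, forced_choice
```

Calling something like this once at the top of the constructor would mean every downstream calculation sees the same types no matter what the user passed in.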

ljchang commented Apr 1, 2021

Thanks for this. We are planning a major refactor of this module soon, as it is a mess.
