Hi, first of all thank you for this awesome package and the Medium article ;)
I am testing the associations function found in nominal.py with a mix of numerical and categorical variables. I provide below a sample (sample.csv) of the dataset that is returning an empty result. My code is:
data = pd.read_csv('sample.csv', header=None)
associations(data)
The numerical columns give sensible results, but I am not getting anything for the categorical ones. My result is:
How is it that nothing is returned?
When testing this with other datasets that have a mix of variables, I have seen cases where everything was calculated just fine, cases where it wasn't (like the one above), and cases where it wasn't and this warning was thrown: RuntimeWarning: divide by zero encountered in double_scalars return np.sqrt(phi2corr / min((kcorr - 1), (rcorr - 1)))
Any help will be greatly appreciated, I can't wait to use this package more!
There's a bug in the code that this specific data triggers for some reason. It has to do with the pd.crosstab part of cramers_v. I'll try fixing it soon. In the meantime, you can use theil_u=True in the associations method.
The problem with column 2 is that it has only a single value in it (at least in this example). There's an underlying assumption that each column has at least two distinct values. I'll add an option to ignore single-value columns, and perhaps print a clearer warning.
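Until such an option exists, one workaround is to find and drop single-value columns before computing associations. The snippet below is only a sketch: single_value_columns is a hypothetical helper, not part of the package, and the small DataFrame is made-up data standing in for sample.csv.

```python
import pandas as pd

def single_value_columns(df):
    """Return the labels of columns with fewer than two distinct non-null values.

    Hypothetical helper for illustration; not part of the package.
    """
    counts = df.nunique(dropna=True)
    return counts[counts < 2].index.tolist()

# Made-up data: column 2 holds a single repeated value.
data = pd.DataFrame({
    0: [1.0, 2.5, 3.1],
    1: ["x", "y", "x"],
    2: ["same", "same", "same"],
})

constant = single_value_columns(data)
filtered = data.drop(columns=constant)
print("Dropped columns:", constant)
```

The filtered frame can then be passed to associations in place of the original one.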
So there's a rare edge case here, where the bias correction of Cramer's V ends up with a denominator of 0. I added an option to disable the bias correction in version 0.5.0. This should prevent errors like these.
In the new version, the plotted heat map will look like this:
Along with a clear warning:
RuntimeWarning: Unable to calculate Cramer's V using bias correction. Consider trying using bias_correction=False
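To see how the denominator can reach zero, here is a minimal pure-Python sketch of Cramer's V with the bias correction, following the expression in the warning above (phi2corr, kcorr, rcorr). It is not the package's implementation: cramers_v_sketch is a hypothetical name, the chi-squared statistic is computed by hand with no continuity correction, and returning None for the undefined case is my own choice for illustration.

```python
import math
from collections import Counter

def cramers_v_sketch(x, y, bias_correction=True):
    """Cramer's V for two equal-length sequences of categories.

    Returns None when the statistic is undefined (zero denominator).
    Illustrative sketch only; not the package's code.
    """
    n = len(x)
    if n < 2:
        return None
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    if bias_correction:
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r = r - (r - 1) ** 2 / (n - 1)  # rcorr
        k = k - (k - 1) ** 2 / (n - 1)  # kcorr

    denominator = min(k - 1, r - 1)
    if denominator <= 0:
        return None
    return math.sqrt(phi2 / denominator)

# A column whose values are all distinct (r == n) gives rcorr == 1,
# so the corrected denominator is exactly 0.
ids = ["a", "b", "c", "d"]
groups = ["u", "u", "v", "v"]
corrected = cramers_v_sketch(ids, groups, bias_correction=True)
uncorrected = cramers_v_sketch(ids, groups, bias_correction=False)
print(corrected, uncorrected)
```

In this sketch, a column where every value is distinct makes rcorr equal to 1, so the corrected denominator vanishes while the uncorrected statistic is still defined. A single-value column makes the denominator zero with or without the correction, which is why that case needs separate handling.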