explore_feature_map() function feedback #60

Closed
3 tasks done
pokrovskyy opened this issue Mar 26, 2020 · 2 comments
Comments

pokrovskyy commented Mar 26, 2020

clsu22

  • A wrong input type for the dataframe argument should raise a TypeError instead of a ValueError.
  • This function is useful for numeric variables but seems to do nothing with categorical variables. I think you should clarify this in the function description. My suggestion is to also include categorical variables and use ANOVA test statistics or p-values to show the relationship between numeric and categorical variables. You could also use a chi-square test to measure the association between two categorical variables. See my comment below.
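
The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not the project's code: `validate_dataframe`, `anova_p_value`, and `chi_square_p_value` are hypothetical helper names, and the sketch assumes pandas and SciPy are available.

```python
import pandas as pd
from scipy import stats


def validate_dataframe(data):
    """Hypothetical check: raise TypeError (not ValueError) for a wrong input type."""
    if not isinstance(data, pd.DataFrame):
        raise TypeError(
            f"data must be a pandas DataFrame, got {type(data).__name__}"
        )
    return data


def anova_p_value(data, numeric_col, categorical_col):
    """One-way ANOVA p-value: does the numeric column's mean differ across categories?"""
    groups = [
        group[numeric_col].dropna().to_numpy()
        for _, group in data.groupby(categorical_col)
    ]
    return stats.f_oneway(*groups).pvalue


def chi_square_p_value(data, col_a, col_b):
    """Chi-square test of independence between two categorical columns."""
    contingency = pd.crosstab(data[col_a], data[col_b])
    # Index 1 of the result is the p-value.
    return stats.chi2_contingency(contingency)[1]
```

A low p-value from either test suggests the two variables are related; unlike a Pearson coefficient, it does not express the strength or direction of that relationship.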

Keanna-K (for R, transitioned to Python)

  • Consider adding an option for handling data frames that contain NA values. Currently, if the function is run on a data frame with NA values, it completes without raising any errors or warnings, but the pairwise Pearson correlation is not shown for the affected columns.
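
One possible shape for such an option is sketched below. The helper name `handle_missing` and its `na_action` parameter are assumptions made for illustration; they are not part of the existing function.

```python
import warnings

import pandas as pd


def handle_missing(data, na_action="warn"):
    """Hypothetical pre-check for missing values before computing correlations.

    na_action: "warn" keeps the data and emits a warning,
               "drop" removes rows containing any NA,
               "raise" raises a ValueError.
    """
    if na_action not in ("warn", "drop", "raise"):
        raise ValueError(f"Unknown na_action: {na_action!r}")

    na_columns = data.columns[data.isna().any()].tolist()
    if not na_columns:
        return data

    message = f"Columns with missing values: {na_columns}"
    if na_action == "raise":
        raise ValueError(message)
    if na_action == "drop":
        return data.dropna()

    warnings.warn(message, UserWarning)
    return data
```

Defaulting to a warning keeps existing calls working while making the missing values visible to the user.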
@pokrovskyy pokrovskyy self-assigned this Mar 26, 2020
@pokrovskyy pokrovskyy added this to the milestone4 milestone Mar 26, 2020
@pokrovskyy pokrovskyy added the enhancement New feature or request label Mar 26, 2020

pokrovskyy commented Mar 26, 2020

On categorical data:

Visualizing categorical data is complex because it depends heavily on the type of categorical data. How do you determine that a column is categorical and not just free text? Is it sequential?

Thus I believe the best solution is to keep this function as is for now and let users partition their data at their discretion, then run the pairwise feature correlation / plot on each partition individually.
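
As a rough sketch of that workflow, a user could split the columns by dtype before running the correlation. The helper name `partition_by_dtype` and the sample data are made up for illustration; the call to explore_feature_map() itself is left out because its signature is not shown in this thread.

```python
import pandas as pd


def partition_by_dtype(data):
    """Split a DataFrame into numeric and non-numeric column partitions."""
    numeric_part = data.select_dtypes(include="number")
    other_part = data.select_dtypes(exclude="number")
    return numeric_part, other_part


# Illustrative data, not taken from the project.
df = pd.DataFrame({
    "age": [23, 35, 47, 59],
    "income": [30.0, 52.5, 61.0, 80.0],
    "city": ["Lviv", "Kyiv", "Lviv", "Odesa"],
})

numeric_part, other_part = partition_by_dtype(df)

# Pairwise Pearson correlation on the numeric partition only.
pearson = numeric_part.corr(method="pearson")
```

The non-numeric partition is left for the user to handle however suits their data.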

@pokrovskyy (Collaborator, Author)

Moved the discussion on plotting categorical features to #67.
