Suggested by clsu22 regarding the explore_feature_map() function:
This function is useful for numeric variables but seems to do nothing with categorical variables. I think you should clarify this in the function description. My suggestion is to also support categorical variables: use ANOVA test statistics or p-values to show the association between numeric and categorical variables, and a chi-square test to show the association between two categorical variables.
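To make the suggestion concrete, here is a minimal sketch of the two tests using `scipy.stats.f_oneway` and `scipy.stats.chi2_contingency`. It is not part of `explore_feature_map()`; the helper names, the toy data, and the column names are all hypothetical and only illustrate the idea.

```python
import pandas as pd
from scipy.stats import chi2_contingency, f_oneway

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "height": [150, 152, 149, 170, 172, 171, 160, 161, 159],
    "group": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "smoker": ["y", "n", "n", "y", "y", "n", "n", "y", "n"],
})


def anova_numeric_vs_categorical(data, numeric_col, categorical_col):
    """One-way ANOVA: does the numeric mean differ across category levels?"""
    samples = [
        part[numeric_col].to_numpy()
        for _, part in data.groupby(categorical_col)
    ]
    f_stat, p_value = f_oneway(*samples)
    return f_stat, p_value


def chi2_categorical_vs_categorical(data, col_a, col_b):
    """Chi-square test of independence on the contingency table of two columns."""
    table = pd.crosstab(data[col_a], data[col_b])
    chi2, p_value, dof, _expected = chi2_contingency(table.to_numpy())
    return chi2, p_value, dof


f_stat, anova_p = anova_numeric_vs_categorical(df, "height", "group")
chi2, chi2_p, dof = chi2_categorical_vs_categorical(df, "group", "smoker")
print(f"ANOVA: F={f_stat:.2f}, p={anova_p:.4g}")
print(f"Chi-square: chi2={chi2:.2f}, p={chi2_p:.4g}, dof={dof}")
```

Both tests report whether an association exists, not how strong it is, so they would complement a correlation map rather than replace it.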
Visualizing categorical data is complex because it depends heavily on the type of categorical data. How do you decide whether a column is truly categorical and not just free text? Is it sequential (ordered)?
For now, I believe the best solution is to keep this function as is and let end users partition their data at their discretion. They can then run the pairwise feature correlation / plot on each partition individually.
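A minimal sketch of that workflow, assuming the data lives in a pandas DataFrame. The helper `correlation_by_partition` and the toy columns are hypothetical, and plain `DataFrame.corr()` stands in for the call to `explore_feature_map()`, whose signature is not shown in this thread.

```python
import pandas as pd

# Hypothetical toy data: x and y are numeric, species is categorical.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 1, 2, 3, 4],
    "y": [2, 4, 6, 8, 8, 6, 4, 2],
    "species": ["a"] * 4 + ["b"] * 4,
})


def correlation_by_partition(data, by):
    """Return {level: pairwise correlation matrix} for each level of `by`."""
    return {
        level: part.select_dtypes("number").corr()
        for level, part in data.groupby(by)
    }


pooled = df.select_dtypes("number").corr()
per_level = correlation_by_partition(df, "species")

print("pooled x~y:", pooled.loc["x", "y"])
for level, corr in per_level.items():
    print(level, "x~y:", corr.loc["x", "y"])
```

In this toy data the pooled correlation between `x` and `y` cancels out, while each partition shows a perfect positive or negative relationship, which is the kind of structure that partitioning is meant to reveal.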
One idea could be to designate a list of categorical features (via function arguments) and then return an array of plots for each level / combination of levels.
One problem with that approach is that if there are many levels or combinations, it can take a considerable amount of time to run.
Another point is that users can easily do this themselves, for example by looping through their own subset of levels. That gives them more flexibility than brute-force plotting through every level.
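The trade-off can be sketched as a small user-side helper: it groups by several categorical columns, lets the caller keep only the levels they care about, and refuses to run when the number of combinations gets large. Everything here (`partitions_for_levels`, `keep`, `max_partitions`, the toy data) is hypothetical and not part of the package; the correlation call marks where a plot per partition would go.

```python
import pandas as pd

# Hypothetical toy data with two categorical columns.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [1, 3, 2, 4, 8, 6, 7, 5],
    "site": ["n", "n", "n", "n", "s", "s", "s", "s"],
    "year": [2019, 2019, 2020, 2020, 2019, 2019, 2020, 2020],
})


def partitions_for_levels(data, categorical_cols, keep=None, max_partitions=20):
    """Yield (levels, numeric subset) for each observed combination of levels.

    keep: optional dict mapping a column name to the levels to retain.
    max_partitions: guard against an explosion of level combinations.
    """
    if keep:
        for col, levels in keep.items():
            data = data[data[col].isin(list(levels))]
    n_combos = data.groupby(categorical_cols).ngroups
    if n_combos > max_partitions:
        raise ValueError(
            f"{n_combos} level combinations exceed max_partitions={max_partitions}"
        )
    for levels, part in data.groupby(categorical_cols):
        yield levels, part.drop(columns=categorical_cols)


# Brute force: every observed (site, year) combination.
everything = list(partitions_for_levels(df, ["site", "year"]))

# User's choice: only the "n" site, which halves the work.
selected = list(partitions_for_levels(df, ["site", "year"], keep={"site": ["n"]}))

for levels, part in selected:
    # A per-partition plot or feature map would be produced here instead.
    print(levels, part.corr().loc["x", "y"])
```

Because the number of partitions is the product of the level counts in the worst case, leaving the selection to the caller keeps the run time under their control.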