Discussion on plotting categorical data #67

pokrovskyy · 2020-03-27T00:56:45Z

Suggested by clsu22 regarding the explore_feature_map() function:

This function is useful for numeric variables but seems to do nothing with categorical variables. I think you should clarify this in your function description. My suggestion is that you could also include categorical variables and use the ANOVA test statistic or p-value to show the correlation between numeric and categorical variables. You could also run a chi-square test to find the correlation between two categorical variables.
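For reference, the two statistics mentioned in the suggestion can be sketched in plain Python. This is only an illustration, not part of pyxplr: the helper names `anova_f_statistic` and `chi_square_statistic` and the toy data are hypothetical, and the sketch computes the test statistics only, without p-values. In practice, `scipy.stats.f_oneway` and `scipy.stats.chi2_contingency` return p-values as well.

```python
from collections import Counter, defaultdict
from statistics import fmean


def anova_f_statistic(values, labels):
    """One-way ANOVA F statistic for numeric `values` grouped by `labels`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((x - fmean(g)) ** 2 for g in groups.values() for x in g)
    # Assumes at least two groups and some within-group variation.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def chi_square_statistic(labels_a, labels_b):
    """Pearson chi-square statistic for two categorical variables (no continuity correction)."""
    n = len(labels_a)
    observed = Counter(zip(labels_a, labels_b))
    totals_a, totals_b = Counter(labels_a), Counter(labels_b)
    stat = 0.0
    for a, count_a in totals_a.items():
        for b, count_b in totals_b.items():
            expected = count_a * count_b / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data, for illustration only.
species = ["a", "a", "a", "b", "b", "b"]
habitat = ["x", "x", "y", "y", "y", "x"]
weight = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

f_stat = anova_f_statistic(weight, species)   # numeric vs. categorical
chi2 = chi_square_statistic(species, habitat)  # categorical vs. categorical
```

A larger F or chi-square value indicates a stronger association. Turning either into a p-value needs the corresponding reference distribution, which is where SciPy would come in.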

Visualizing categorical data is complex because it depends heavily on the type of categorical data. How do you determine that a column is truly categorical and not just some textual data? Is it sequential (ordered)?

For now, I believe the best solution would be to keep this function as is and let the end user partition their data at their discretion. Then they could run pairwise feature correlation / plot on each partition individually.
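The user-side workflow described above could look roughly like this. It is a minimal sketch under stated assumptions: `correlations_by_level` and `pearson` are hypothetical helpers, not pyxplr functions, and the data is held as a list of plain dicts to keep the example dependency-free. With pandas, the same partitioning is what `DataFrame.groupby` does.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Assumes neither feature is constant within the partition.
    return cov / sqrt(var_x * var_y)


def correlations_by_level(rows, category, numeric_features):
    """Partition `rows` by `category`, then correlate every numeric pair per partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[category]].append(row)

    result = {}
    for level, part in partitions.items():
        result[level] = {
            (f1, f2): pearson([r[f1] for r in part], [r[f2] for r in part])
            for f1, f2 in combinations(numeric_features, 2)
        }
    return result


# Hypothetical toy data: the x/y relationship flips sign between groups.
rows = [
    {"group": "a", "x": 1.0, "y": 2.0},
    {"group": "a", "x": 2.0, "y": 4.0},
    {"group": "a", "x": 3.0, "y": 6.0},
    {"group": "b", "x": 1.0, "y": 3.0},
    {"group": "b", "x": 2.0, "y": 2.0},
    {"group": "b", "x": 3.0, "y": 1.0},
]
per_level = correlations_by_level(rows, "group", ["x", "y"])
```

In this toy data the correlation is positive in group `a` and negative in group `b`, which a single correlation over the pooled data would hide. That is the kind of pattern partitioning is meant to surface.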


pokrovskyy · 2020-03-27T01:00:43Z

One idea could be to designate a list of categorical features (via function arguments) and then return an array of plots for each level / combination of levels.

One problem with that is that if there are many levels / combinations, it can take a considerable amount of time to run.

Another thing is that the user can easily do this at their own discretion (looping through their subset of levels, etc.). This gives the user even more flexibility than brute-force plotting through all the levels.
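To make the trade-off concrete, here is a rough sketch of both options in one helper: it takes a list of categorical features and groups the rows by every observed combination of levels, or only by the combinations the caller asks for. `partition_by_levels`, its `keep` argument, and the toy data are all hypothetical and not part of pyxplr.

```python
from collections import defaultdict


def partition_by_levels(rows, categorical_features, keep=None):
    """Group rows by each observed combination of levels.

    `keep` is an optional set of level tuples; when given, every other
    combination is skipped instead of being collected.
    """
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[feature] for feature in categorical_features)
        if keep is None or key in keep:
            partitions[key].append(row)
    return dict(partitions)


# Hypothetical toy data, for illustration only.
rows = [
    {"sex": "f", "site": "north", "mass": 3.1},
    {"sex": "f", "site": "south", "mass": 2.9},
    {"sex": "m", "site": "north", "mass": 3.6},
    {"sex": "m", "site": "south", "mass": 3.4},
    {"sex": "m", "site": "south", "mass": 3.5},
]

# Brute force: one partition per observed combination (2 x 2 = 4 here).
all_parts = partition_by_levels(rows, ["sex", "site"])

# User-selected subset: only the combination of interest.
some_parts = partition_by_levels(rows, ["sex", "site"], keep={("m", "south")})
```

The number of partitions can grow with the product of the level counts, so two features with ten levels each could already mean up to 100 plots. Restricting to a chosen subset keeps the run time proportional to what the user actually wants to look at.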

Open to your ideas.

pokrovskyy added the enhancement label Mar 27, 2020

This was referenced Mar 27, 2020

explore_feature_map() function feedback #60

Closed

Submission: pyxplr (Python) UBC-MDS/software-review#34

Open
