Explain breakdown quirks in the UI for Funnels #5427

Closed
neilkakkar opened this issue Aug 3, 2021 · 10 comments · Fixed by #5538
Labels
bug Something isn't working right feature/funnels Feature Tag: Funnels stale

Comments

@neilkakkar
Contributor

Bug description

There are two weird cases that can happen with breakdowns:

(1) On breakdown, the count increases

This happens when a person completes the funnel with more than one breakdown value. For example, when breaking down by browser, a person who completes the funnel in both Chrome and Firefox is counted once under each browser, so they are counted twice in total.

(2) On breakdown, the count decreases

This happens because we limit the number of breakdown values displayed. Values are returned in decreasing order of size, so the smallest ones may be cut off, which causes a slight discrepancy between the numbers with and without a breakdown.

(3) Not-weird: On breakdown, the count remains the same

This happens when neither of the above conditions is met (there are fewer than 10 breakdown values, and no person enters or completes the funnel using different breakdown values).
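To make the first two cases concrete, here is a minimal sketch in Python. The data, the function names and the `limit` parameter are all hypothetical and only illustrate the counting logic described above; this is not how the funnel queries are actually implemented.

```python
from collections import defaultdict

# Hypothetical data: (person_id, browser) pairs for funnel completions.
completions = [
    ("p1", "Chrome"),
    ("p1", "Firefox"),  # p1 completed the funnel in two browsers
    ("p2", "Chrome"),
    ("p3", "Safari"),
    ("p4", "Opera"),
]


def total_without_breakdown(rows):
    """Count distinct persons, ignoring the breakdown value."""
    return len({person for person, _ in rows})


def counts_by_breakdown(rows, limit=None):
    """Count distinct persons per breakdown value, largest first.

    If `limit` is given, only the `limit` largest values are kept and
    the rest are dropped, as in case (2).
    """
    people = defaultdict(set)
    for person, value in rows:
        people[value].add(person)
    ranked = sorted(
        ((value, len(persons)) for value, persons in people.items()),
        key=lambda kv: (-kv[1], kv[0]),
    )
    if limit is not None:
        ranked = ranked[:limit]
    return dict(ranked)


print(total_without_breakdown(completions))                      # 4
print(sum(counts_by_breakdown(completions).values()))            # 5, case (1)
print(sum(counts_by_breakdown(completions, limit=2).values()))   # 3, case (2)
```

With these rows the total is 4 people, the full breakdown sums to 5 because `p1` appears under two browsers, and a limit of 2 drops Opera and Safari so the sum falls to 3.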

Expected

It would be nice to explain to users when there are inconsistencies, and why.

For more context: #5341

@neilkakkar neilkakkar added the bug Something isn't working right label Aug 3, 2021
@macobo macobo added feature/funnels Feature Tag: Funnels UI/UX labels Aug 3, 2021
@marcushyett-ph
Contributor

cc: @alexkim205 @paolodamico @liyiy

I think this is an important little detail to get right.

@jredl-va
Contributor

jredl-va commented Aug 3, 2021

@marcushyett-ph @neilkakkar Testing this out this morning, this feels like a positive step forward but I feel it still misses the mark.

One of our funnels without a breakdown:

image

With a breakdown limit of 50:

image

It would be really great if the outliers (outside of the breakdown limit) were summed into an "others" grouping rather than completely being omitted.
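A rough sketch of the suggested roll-up, in Python. It assumes the per-value sets of person IDs are already available, which is an assumption made for illustration; `roll_up_other` and its parameters are hypothetical names, not part of PostHog.

```python
def roll_up_other(people_by_value, limit, other_label="Other"):
    """Keep the `limit` largest breakdown values and merge the rest.

    `people_by_value` maps a breakdown value to the set of person IDs
    seen with that value. The merged bucket counts distinct persons, so
    someone who appears under several tail values is counted once in it.
    """
    ranked = sorted(
        people_by_value.items(),
        key=lambda kv: (-len(kv[1]), kv[0]),
    )
    result = {value: len(persons) for value, persons in ranked[:limit]}
    tail = set().union(*(persons for _, persons in ranked[limit:]))
    if tail:
        result[other_label] = len(tail)
    return result


example = {
    "Chrome": {"p1", "p2"},
    "Firefox": {"p1"},
    "Safari": {"p3"},
    "Opera": {"p4"},
}
print(roll_up_other(example, limit=2))
# {'Chrome': 2, 'Firefox': 1, 'Other': 2}
```

Nothing is dropped here: the values outside the limit still contribute to the total through the "Other" row.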

@paolodamico
Contributor

I agree with @jredl-va that the most intuitive and helpful thing to do here would be to roll up all small/additional values into an "Other" category. That would take care of (2) & as noted (3) is a non-issue. Wdyt @neilkakkar ?

Re (1), can we maybe get the baseline number from the backend and then maybe show this warning when appropriate? We could do something like what we have for identical sequential funnel steps.

@neilkakkar
Contributor Author

Hmm originally we opted to do it this way because we weren't sure of the performance implications. The idea makes sense to me, I'll test some queries and figure out if there are slowdowns/issues with it, before making a call here. (cc: @macobo , @EDsCODE )

@neilkakkar
Contributor Author

neilkakkar commented Aug 9, 2021

Some interesting analysis on this: https://metabase.posthog.net/question/112

I found a few of the slowest breakdown queries that we run (thanks @macobo !) and tweaked them to see how well we do if we (1) explicitly query all breakdown values, or (2) group the remaining breakdown values into an "Other" category.

Overall impressions: All three kinds of queries come really close to each other, except in a few cases.

Note: It's hard to do accurate timing analysis, so use this as only a rough guide. Also: these queries use a clever-ish optimisation such that we don't need to know all the breakdown values to figure out what should be in "Other" grouping.
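The optimisation itself is not spelled out here; the actual queries are in the Metabase question. One plausible reading, sketched in Python purely as an assumption, is that each row is labelled by testing membership in the known top-N list, so anything not in that list falls into "Other" without the remaining values ever being enumerated. `group_with_other` is a hypothetical name.

```python
from collections import defaultdict


def group_with_other(rows, top_values, other_label="Other"):
    """Count distinct persons per label, knowing only the top values.

    `rows` is an iterable of (person_id, breakdown_value) pairs. Any
    value not in `top_values` is labelled `other_label`, so the full
    list of breakdown values is never needed.
    """
    top = frozenset(top_values)
    people = defaultdict(set)
    for person, value in rows:
        label = value if value in top else other_label
        people[label].add(person)
    return {label: len(persons) for label, persons in people.items()}


rows = [
    ("p1", "Chrome"),
    ("p2", "Chrome"),
    ("p3", "Safari"),
    ("p4", "Opera"),
    ("p4", "Brave"),  # p4 uses two tail values but counts once in "Other"
]
print(group_with_other(rows, top_values=["Chrome"]))
# {'Chrome': 2, 'Other': 2}
```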

When total props are within limit (set 2 in metabase)

Grouping was slightly faster than without(?!): I'd expect them to be around the same, since there's no extra work.

| Request | Timing (ms) | Rows |
| --- | --- | --- |
| original | 4,789 | 3 |
| grouping with same exclusion filters as original | 4,600 | 3 |
| grouping with other | 4,318 | 4 |

When total props are around limit (set 3 in metabase)

The slowdown at limit 5 is larger than I'd expect (~10%; see why in the following analysis), but at our usual limit of 10 the cost of the extra grouping is negligible.

| Request | Timing (ms) | Rows |
| --- | --- | --- |
| original (limit 5) | 5,650 | 5 |
| grouping with original (limit 5) | 6,210 | 6 |
| original (limit 10) | 6,151 | 10 |
| grouping with original (limit 10) | 6,148 | 11 |

When total props are much larger than limit (set 4 in metabase)

Now, we're dealing with over 300 prop values. Grouping 290 of these into "Other" has a ~20% slowdown vs no grouping, while listing all breakdown values has a ~35% slowdown.

| Request | Timing (ms) | Rows |
| --- | --- | --- |
| original (limit 10) | 2,449 | 10 |
| all breakdown values | 3,274 | 300 |
| grouping with original (limit 10) | 2,873 | 11 |
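For reference, the rounded slowdown percentages quoted above can be recomputed from the timings in the tables. A small Python sketch; `slowdown_pct` is a hypothetical helper name.

```python
def slowdown_pct(baseline_ms, candidate_ms):
    """Relative slowdown of `candidate_ms` over `baseline_ms`, in percent."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100


# Set 4: over 300 prop values, limit 10.
print(round(slowdown_pct(2449, 2873), 1))  # 17.3, grouping into "Other"
print(round(slowdown_pct(2449, 3274), 1))  # 33.7, listing all breakdown values

# Set 3: limit 5.
print(round(slowdown_pct(5650, 6210), 1))  # 9.9
```

These match the "~20%", "~35%" and "~10%" figures once rounded.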

Conclusion

Given that these are our worst-performing queries, and that the performance impact with a huge number of breakdown values is bounded above (grouping is still faster than querying all breakdown values), grouping the remaining values into "Other" makes sense to me.

It lets us keep data consistent, especially where inconsistencies would be blatantly obvious (skipping the long tail of breakdown values), without much affecting the usual flow of queries (and making those consistent at the same time!)


Thoughts? Head over to the metabase question to reproduce the full analysis, notice holes, etc.

@macobo
Contributor

macobo commented Aug 10, 2021

@neilkakkar slightly off topic, but do you still have the 3 queries you used? If so, mind adding them to https://github.com/PostHog/scratchpad/, would love to run some flamegraphs/measurements. :)

@neilkakkar
Contributor Author

All of them are here: https://metabase.posthog.net/question/112 - but I'll add a few representative queries there :)

@neilkakkar
Contributor Author

neilkakkar commented Aug 12, 2021

(2) seems to be fixed: https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22action%20created%22%2C%22name%22%3A%22action%20created%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%5D&actions=%5B%5D&funnel_viz_type=steps&display=FunnelViz&interval=day&new_entity=%5B%5D&date_from=-1d&date_to=dStart&breakdown=%24geoip_city_name&breakdown_type=person - example of small count with tonnes of breakdown values remaining consistent.

But I see (1) a lot more frequently now (which is probably how it should be?)

Example: https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22%24autocapture%22%2C%22name%22%3A%22%24autocapture%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%5D&actions=%5B%5D&funnel_viz_type=steps&display=FunnelViz&interval=day&new_entity=%5B%5D&date_from=-1d&date_to=dStart&breakdown=%24browser&breakdown_type=event

If you check the persons for the counts, you'd see they are lower than what the value says. Explaining why there are fewer persons than the count is important here I think. Example: Click on Chrome iOS -> says 14 people -> but returns 13 people. Even if we remove the count (#5560 ), if the number is small enough, people will see the discrepancy, so it's worth clarifying it. We could just show it when the breakdown numbers don't match?
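A minimal sketch of that "only show it when the numbers don't match" check, in Python. It assumes the baseline (no-breakdown) count is available alongside the per-value counts; the function name and the message wording are hypothetical.

```python
def breakdown_warning(baseline_count, breakdown_counts):
    """Return a short explanation when the breakdown sum differs from the baseline.

    `baseline_count` is the count without a breakdown; `breakdown_counts`
    maps each displayed breakdown value to its count. Returns None when
    the numbers agree.
    """
    total = sum(breakdown_counts.values())
    if total > baseline_count:
        return (
            "Breakdown counts add up to more than the total: some people "
            "match more than one breakdown value."
        )
    if total < baseline_count:
        return (
            "Breakdown counts add up to less than the total: some "
            "breakdown values are not shown."
        )
    return None


print(breakdown_warning(4, {"Chrome": 2, "Firefox": 1, "Other": 2}))
print(breakdown_warning(4, {"Chrome": 2, "Safari": 2}))  # None
```

Comparing sums is a coarse check: an excess from double counting and a shortfall from hidden values could cancel out, so this is only a starting point.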

@paolodamico paolodamico removed the UI/UX label Feb 22, 2022
@posthog-bot
Contributor

This issue hasn't seen activity in two years! If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

@posthog-bot
Contributor

This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.

@posthog-bot posthog-bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 29, 2024