Explain breakdown quirks in the UI for Funnels #5427
Comments
cc: @alexkim205 @paolodamico @liyiy I think this is an important little detail to get right. |
@marcushyett-ph @neilkakkar Testing this out this morning, this feels like a positive step forward but I feel it still misses the mark. One of our funnels without a breakdown: With a breakdown limit of 50: It would be really great if the outliers (outside of the breakdown limit) were summed into an "others" grouping rather than completely being omitted. |
I agree with @jredl-va that the most intuitive and helpful thing to do here would be to roll up all small/additional values into an "Other" category. That would take care of (2) & as noted (3) is a non-issue. Wdyt @neilkakkar ? Re (1), can we maybe get the baseline number from the backend and then maybe show this warning when appropriate? We could do something like what we have for identical sequential funnel steps. |
Some interesting analysis on this: https://metabase.posthog.net/question/112

I found a few of the slowest breakdown queries that we run (thanks @macobo!) and tweaked them to see how well we do if (1) we explicitly query all breakdown values, or (2) we group the remaining breakdown values into an "Other" category.

Overall impressions: all three kinds of queries come really close to each other, except in a few cases.

Note: it's hard to do accurate timing analysis, so use this only as a rough guide. Also, these queries use a clever-ish optimisation such that we don't need to know all the breakdown values to figure out what should be in the "Other" grouping.

When total props are within limit (set 2 in metabase)
Grouping was slightly faster than without(?!). I'd expect them to be around the same, since there's no extra work.

When total props are around limit (set 3 in metabase)
The slowdown on limit 5 is larger than I'd expect (~10% - see why in the following analysis), but with our usual limit of 10, the cost of that extra grouping is negligible.

When total props are much larger than limit (set 4 in metabase)
Now we're dealing with over 300 prop values. Grouping 290 of these into "Other" has a ~20% slowdown vs no grouping, while listing all breakdown values has a ~35% slowdown.

Conclusion
Given that these are our worst-performing queries, and the performance impact with a huge number of breakdown values is bounded above (grouping is faster than querying all breakdown values), grouping the remaining values into "Other" makes sense to me. It lets us keep data consistent, especially where inconsistencies would be blatantly obvious (skipping the long tail of breakdown values), without much affecting the usual flow of queries (making those consistent at the same time!). Thoughts? Head over to the metabase question to reproduce the full analysis, notice holes, etc. |
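For readers who want the "Other" idea in concrete terms, here is a minimal Python sketch. It is not the actual query logic (the real queries are linked in the metabase question); the function names, the `OTHER` label, and the sample rows are all made up for illustration. The point it shows is that only the top values need to be known: anything outside that set is relabelled, so the long tail never has to be listed.

```python
from collections import defaultdict

OTHER = "Other"  # hypothetical label for the rolled-up long tail


def bucket(value, top_values):
    """Keep a breakdown value if it is in the top set, else relabel it."""
    return value if value in top_values else OTHER


def counts_with_other(rows, top_values):
    """Count distinct persons per bucket.

    rows: iterable of (person_id, breakdown_value) funnel completions.
    top_values: the breakdown values that get their own row (assumed to be
    the N largest, computed elsewhere).
    """
    persons_by_bucket = defaultdict(set)
    for person, value in rows:
        persons_by_bucket[bucket(value, top_values)].add(person)
    return {name: len(persons) for name, persons in persons_by_bucket.items()}


# Made-up data: p4 completes the funnel with two different tail values.
rows = [
    ("p1", "Chrome"),
    ("p2", "Chrome"),
    ("p3", "Safari"),
    ("p4", "Opera"),
    ("p4", "Edge"),
]
grouped = counts_with_other(rows, top_values={"Chrome"})
```

With a limit of one value, `grouped` has a "Chrome" row and an "Other" row, and nothing is dropped. Because persons are deduplicated inside each bucket, someone who appears under several tail values is counted once in "Other".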
@neilkakkar slightly off topic, but do you still have the 3 queries you used? If so, mind adding them to https://github.com/PostHog/scratchpad/, would love to run some flamegraphs/measurements. :) |
All of them are here: https://metabase.posthog.net/question/112 - but I'll add a few representative queries there :) |
(2) seems to be fixed: https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22action%20created%22%2C%22name%22%3A%22action%20created%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%5D&actions=%5B%5D&funnel_viz_type=steps&display=FunnelViz&interval=day&new_entity=%5B%5D&date_from=-1d&date_to=dStart&breakdown=%24geoip_city_name&breakdown_type=person - example of small count with tonnes of breakdown values remaining consistent. But I see (1) a lot more frequently now (which is probably how it should be?) If you check the persons for the counts, you'd see they are lower than what the value says. Explaining why there are fewer persons than the count is important here I think. Example: Click on Chrome iOS -> says 14 people -> but returns 13 people. Even if we remove the count (#5560 ), if the number is small enough, people will see the discrepancy, so it's worth clarifying it. We could just show it when the breakdown numbers don't match? |
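A rough sketch of the "only show it when the numbers don't match" idea, in Python. This assumes the baseline (no-breakdown) count is available next to the per-value counts, as suggested earlier in the thread; the function name and the return values are hypothetical, not an existing PostHog API.

```python
def breakdown_discrepancy(baseline, counts_by_value):
    """Compare the no-breakdown count with the sum over breakdown values.

    Returns "overcount" if the breakdown total exceeds the baseline,
    "undercount" if it falls short, and None if the two match.
    """
    total = sum(counts_by_value.values())
    if total > baseline:
        return "overcount"
    if total < baseline:
        return "undercount"
    return None


# Made-up numbers: 4 persons overall, 5 once broken down by browser.
status = breakdown_discrepancy(4, {"Chrome": 2, "Firefox": 1, "Safari": 1, "Opera": 1})
```

One caveat with this check: if some persons are double-counted and some values are cut off at the same time, the two effects can cancel out, so a matching total does not guarantee that neither happened.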
This issue hasn't seen activity in two years! If you want to keep it open, post a comment or remove the |
This issue was closed due to lack of activity. Feel free to reopen if it's still relevant. |
Bug description
There are 2 weird cases that can happen with breakdowns:
(1) On breakdown, the count increases
This happens when a person completes the funnel with more than one value of the breakdown property (e.g. breakdown by browser: a person who completes the funnel in both Chrome and Firefox is counted once under each value, so they are counted twice overall)
(2) On breakdown, the count decreases
This happens because we have a limit on the number of breakdown values to display. These values are returned in decreasing order of size, so smaller breakdown values might not show up, which makes the total with a breakdown slightly lower than the total without one
(3) Not-weird: On breakdown, the count remains the same
This happens when neither of the above conditions is met (there are fewer than 10 breakdown values, and no person completes/enters the funnel using different breakdown values)
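The three cases can be reproduced with a small Python sketch. The data and function names here are invented for illustration and do not reflect how the funnel queries are actually implemented; the sketch only assumes that a breakdown counts distinct persons per value and keeps the largest values up to a limit.

```python
from collections import defaultdict

# Made-up funnel completions as (person_id, browser) pairs.
# p1 completes the funnel in both Chrome and Firefox.
completions = [
    ("p1", "Chrome"),
    ("p1", "Firefox"),
    ("p2", "Chrome"),
    ("p3", "Safari"),
    ("p4", "Opera"),
]


def baseline_count(rows):
    """Distinct persons completing the funnel, with no breakdown."""
    return len({person for person, _ in rows})


def breakdown_counts(rows, limit=None):
    """Distinct persons per breakdown value, largest first, cut at `limit`."""
    persons_by_value = defaultdict(set)
    for person, value in rows:
        persons_by_value[value].add(person)
    counts = {value: len(persons) for value, persons in persons_by_value.items()}
    # Sort by size descending, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:limit])


no_breakdown = baseline_count(completions)                          # 4
all_values = sum(breakdown_counts(completions).values())            # 5: case (1)
limited = sum(breakdown_counts(completions, limit=2).values())      # 3: case (2)
```

Here the unlimited breakdown sums to 5 because p1 is counted under both Chrome and Firefox, while a limit of 2 keeps only Chrome and Firefox and sums to 3 because Safari and Opera are dropped. Case (3) is what you get when no person spans multiple values and every value fits within the limit.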
Expected
It would be nice to explain to users when there are inconsistencies, and why.
For more context: #5341