Explain breakdown quirks in the UI for Funnels #5427

neilkakkar · 2021-08-03T11:13:46Z

Bug description

There are two weird cases that can happen with breakdowns:

(1) On breakdown, the count increases

This happens when a person completes a funnel with more than one breakdown value. For example, when breaking down by browser, a person who completes the funnel in both Chrome and Firefox is counted once under each browser, so they are counted twice overall.

(2) On breakdown, the count decreases

This happens because we limit the number of breakdown values displayed. Values are returned in decreasing order of size, so smaller breakdown values might not show up, which causes a slight discrepancy between the numbers with and without a breakdown.

(3) Not-weird: On breakdown, the count remains the same

This happens when neither of the above conditions is met (there are fewer than 10 breakdown values, and no person completes or enters the funnel using different breakdown values).
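A minimal sketch of cases (1) and (2), assuming funnel completions can be represented as (person, breakdown value) pairs. The data, function names, and the `limit` parameter are hypothetical and only illustrate the counting logic, not the actual query:

```python
from collections import defaultdict

# Hypothetical data: (person_id, browser) pairs for people who completed the funnel.
completions = [
    ("p1", "Chrome"),
    ("p1", "Firefox"),  # p1 completed the funnel in two browsers
    ("p2", "Chrome"),
    ("p3", "Safari"),
    ("p4", "Opera"),
]

def total_without_breakdown(rows):
    """Each person counts once, however many breakdown values they used."""
    return len({person for person, _ in rows})

def counts_by_breakdown(rows, limit=None):
    """Count distinct persons per breakdown value, keeping only the `limit` largest."""
    persons = defaultdict(set)
    for person, value in rows:
        persons[value].add(person)
    counts = {value: len(ids) for value, ids in persons.items()}
    # Largest first; ties broken by name so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:limit])

total = total_without_breakdown(completions)         # 4 distinct persons
all_values = counts_by_breakdown(completions)        # sums to 5: p1 is counted twice
top_two = counts_by_breakdown(completions, limit=2)  # sums to 3: Opera and Safari are dropped
```

With no limit, the breakdown sums to more than the total (case 1). With a limit of 2, it sums to less (case 2).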

Expected

It would be nice to explain to users when there are inconsistencies, and why.

For more context: #5341

marcushyett-ph · 2021-08-03T12:32:53Z

cc: @alexkim205 @paolodamico @liyiy

I think this is an important little detail to get right.

jredl-va · 2021-08-03T14:39:18Z

@marcushyett-ph @neilkakkar Testing this out this morning, this feels like a positive step forward but I feel it still misses the mark.

One of our funnels without a breakdown:

With a breakdown limit of 50:

It would be really great if the outliers (outside of the breakdown limit) were summed into an "others" grouping rather than completely being omitted.
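The "others" idea could be sketched roughly as follows, again assuming completions are available as (person, breakdown value) pairs. The `OTHER` label and the function name are hypothetical, and this is an in-memory illustration rather than how the query would be written:

```python
from collections import defaultdict

OTHER = "Other"  # hypothetical label for the rolled-up bucket

def breakdown_with_other(rows, limit):
    """Count distinct persons per breakdown value, folding the long tail into OTHER."""
    persons = defaultdict(set)
    for person, value in rows:
        persons[value].add(person)
    # Largest values first; ties broken by name so the result is deterministic.
    ranked = sorted(persons, key=lambda v: (-len(persons[v]), v))
    result = {value: len(persons[value]) for value in ranked[:limit]}
    # Union the person sets of the remaining values so that a person seen
    # under several tail values is counted only once in OTHER.
    tail = set()
    for value in ranked[limit:]:
        tail |= persons[value]
    if tail:
        result[OTHER] = len(tail)
    return result

rows = [("p1", "Chrome"), ("p2", "Chrome"), ("p3", "Safari"), ("p4", "Opera"), ("p5", "Edge")]
print(breakdown_with_other(rows, limit=2))
```

Here the two smallest values are summed into a single "Other" row, so the breakdown still adds up to the five persons in the data.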

paolodamico · 2021-08-03T22:25:26Z

I agree with @jredl-va that the most intuitive and helpful thing to do here would be to roll up all small/additional values into an "Other" category. That would take care of (2) & as noted (3) is a non-issue. Wdyt @neilkakkar ?

Re (1), can we maybe get the baseline number from the backend and then maybe show this warning when appropriate? We could do something like what we have for identical sequential funnel steps.

neilkakkar · 2021-08-04T10:50:05Z

Hmm, originally we opted to do it this way because we weren't sure of the performance implications. The idea makes sense to me. I'll test some queries and figure out if there are slowdowns/issues with it before making a call here. (cc: @macobo, @EDsCODE)

neilkakkar · 2021-08-09T16:02:40Z

Some interesting analysis on this: https://metabase.posthog.net/question/112

I found a few of the slowest breakdown queries that we run (thanks @macobo !) and tweaked these queries to see how well we do if (1) we explicitly query all breakdown values, or (2) we group remaining breakdown values into an "Other" category.

Overall impressions: All three kinds of queries come really close to each other, except in a few cases.

Note: It's hard to do accurate timing analysis, so use this as only a rough guide. Also: these queries use a clever-ish optimisation such that we don't need to know all the breakdown values to figure out what should be in "Other" grouping.

When total props are within limit (set 2 in metabase)

Grouping was slightly faster than without(?!): I'd expect them to be around the same, since there's no extra work.

Request	Timing (ms)	Rows
original	4,789	3
grouping with same exclusion filters as original	4,600	3
grouping with other	4,318	4

When total props are around limit (set 3 in metabase)

The slowdown at limit 5 is larger than I'd expect (~10%; see why in the following analysis), but with our usual limit of 10, the cost of the extra grouping is negligible.

Request	Timing (ms)	Rows
original (limit 5)	5,650	5
grouping with original (limit 5)	6,210	6
original (limit 10)	6,151	10
grouping with original (limit 10)	6,148	11

When total props are much larger than limit (set 4 in metabase)

Now, we're dealing with over 300 prop values. Grouping 290 of these into "Other" has a ~17% slowdown vs no grouping (2,873 ms vs 2,449 ms), while listing all breakdown values has a ~34% slowdown (3,274 ms vs 2,449 ms).

Request	Timing (ms)	Rows
original (limit 10)	2,449	10
all breakdown values	3,274	300
grouping with original (limit 10)	2,873	11
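For reference, the relative slowdowns can be recomputed from the timings in the tables above. The helper name is hypothetical, and since the timings are rough, the percentages are only a rough guide too:

```python
def slowdown_pct(baseline_ms, candidate_ms):
    """Relative slowdown of candidate vs baseline, in percent."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100

# Set 3, limit 5: original vs grouping
print(f"{slowdown_pct(5650, 6210):.1f}%")  # 9.9%

# Set 4, limit 10: original vs grouping into "Other"
print(f"{slowdown_pct(2449, 2873):.1f}%")  # 17.3%

# Set 4: original vs listing all breakdown values
print(f"{slowdown_pct(2449, 3274):.1f}%")  # 33.7%
```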

Conclusion

Given that these are our worst performing queries, and the performance impact with a huge number of breakdown values is bounded above (grouping is faster than querying all breakdown values), grouping remaining values into "Other" makes sense to me.

It lets us keep data consistent, especially where inconsistencies would be blatantly obvious (skipping the long tail of breakdown values), without much affecting the usual flow of queries (making those consistent at the same time!)

Thoughts? Head over to the metabase question to reproduce the full analysis, notice holes, etc.

macobo · 2021-08-10T06:18:52Z

@neilkakkar slightly off topic, but do you still have the 3 queries you used? If so, mind adding them to https://github.com/PostHog/scratchpad/, would love to run some flamegraphs/measurements. :)

neilkakkar · 2021-08-10T08:23:54Z

All of them are here: https://metabase.posthog.net/question/112 - but I'll add a few representative queries there :)

neilkakkar · 2021-08-12T13:56:07Z

(2) seems to be fixed: https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22action%20created%22%2C%22name%22%3A%22action%20created%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%5D&actions=%5B%5D&funnel_viz_type=steps&display=FunnelViz&interval=day&new_entity=%5B%5D&date_from=-1d&date_to=dStart&breakdown=%24geoip_city_name&breakdown_type=person - example of small count with tonnes of breakdown values remaining consistent.

But I see (1) a lot more frequently now (which is probably how it should be?)

Example: https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22%24autocapture%22%2C%22name%22%3A%22%24autocapture%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%5D&actions=%5B%5D&funnel_viz_type=steps&display=FunnelViz&interval=day&new_entity=%5B%5D&date_from=-1d&date_to=dStart&breakdown=%24browser&breakdown_type=event

If you check the persons for the counts, you'd see they are lower than what the value says. Explaining why there are fewer persons than the count is important here, I think. Example: Click on Chrome iOS -> says 14 people -> but returns 13 people. Even if we remove the count (#5560), if the number is small enough, people will see the discrepancy, so it's worth clarifying it. We could just show it when the breakdown numbers don't match?
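One possible shape for that check, as a sketch only: it assumes the backend returns a baseline count (without breakdown) alongside the per-value counts, as suggested earlier in the thread. The function name and the message wording are hypothetical:

```python
def breakdown_mismatch_message(baseline_count, breakdown_counts):
    """Return an explanatory message when breakdown counts don't add up to the baseline.

    baseline_count: number of persons for the step without a breakdown.
    breakdown_counts: mapping of breakdown value -> number of persons.
    Returns None when the numbers match, so no message needs to be shown.
    """
    total = sum(breakdown_counts.values())
    if total > baseline_count:
        # Case (1): some persons completed the step with several breakdown values.
        return (
            "Some people completed this step with more than one breakdown value, "
            "so they are counted more than once."
        )
    if total < baseline_count:
        # Case (2): some breakdown values were left out of the results.
        return "Some breakdown values are not shown, so the counts do not add up to the total."
    return None

print(breakdown_mismatch_message(13, {"Chrome": 8, "Firefox": 6}))
```

Returning `None` when the totals match keeps the UI unchanged in the not-weird case (3).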

posthog-bot · 2024-03-14T07:32:02Z

This issue hasn't seen activity in two years! If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

posthog-bot · 2024-03-29T07:31:15Z

This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.

neilkakkar added the bug label Aug 3, 2021

neilkakkar mentioned this issue Aug 3, 2021

Inconsistency between breakdown and roll-up of funnel #5341

Closed

2 tasks

macobo added the feature/funnels and UI/UX labels Aug 3, 2021

neilkakkar mentioned this issue Aug 11, 2021

Group reamining breakdown values into "Other" for funnels #5538

Merged

6 tasks

neilkakkar closed this as completed in #5538 Aug 12, 2021

neilkakkar reopened this Aug 12, 2021

paolodamico removed the UI/UX label Feb 22, 2022

posthog-bot added the stale label Mar 14, 2024

posthog-bot closed this as not planned Mar 29, 2024
