When using the variants percentage filter (variants_filter.filter_log_variants_percentage) the filter doesn't actually scale linearly.
When using the following code I get the following results with my dataset:
variants_filter.filter_log_variants_percentage(log, percentage=1)    # Length is 4617
variants_filter.filter_log_variants_percentage(log, percentage=.1)   # Length is still 4617
variants_filter.filter_log_variants_percentage(log, percentage=.05)  # Length is 241
Dear Selene Codes, the variants filter on percentage works as follows, given the percentage P:
The variants of the log are found along with their number of occurrences
A number N is chosen such that if we take all the variants with at least N occurrences, we include a percentage of cases that is at least P, while if we choose N+1 we would include a percentage of cases that is below P.
If the log contains the following variants:
ABC (2 occurrences)
A,B,C,AB,AC,BC,CB,BA,CA,ABCD,ABCE,ABCF,ABCG,ABCH,ABCI,ABCL,ABCM,ABCN (1 occurrence each)
Then with percentage=1, all 20 cases would be kept.
If we choose percentage=0.1 and N=1, then we include all the cases, while with N=2 we include only the cases of the first variant (which are 5% of the log, hence N=2 is not valid according to the above principle).
If we choose percentage=0.05 and N=2, then we include exactly 5% of the cases of the log, which is the minimum requirement.
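The rule described in this comment can be sketched in a few lines of Python. This is only an illustration of the principle as stated here, not pm4py's actual implementation: the function names `choose_threshold` and `filter_by_percentage` are hypothetical, and the variant counts are made up for the example rather than taken from any log in this thread.

```python
def choose_threshold(variant_counts, percentage):
    """Return the largest N such that the variants with at least N
    occurrences cover at least `percentage` of all cases.

    Hypothetical helper illustrating the rule described in the thread.
    """
    if not variant_counts:
        raise ValueError("variant_counts must not be empty")
    total = sum(variant_counts.values())
    # Only the distinct occurrence counts can be the answer, so try them
    # from the largest (strictest) to the smallest (most permissive).
    for n in sorted(set(variant_counts.values()), reverse=True):
        covered = sum(c for c in variant_counts.values() if c >= n)
        if covered / total >= percentage:
            return n
    raise ValueError("percentage must be at most 1")


def filter_by_percentage(variant_counts, percentage):
    """Return (N, kept variants, number of kept cases)."""
    n = choose_threshold(variant_counts, percentage)
    kept = {v: c for v, c in variant_counts.items() if c >= n}
    return n, kept, sum(kept.values())


# Made-up variant counts: 10 cases in total.
counts = {"ABC": 5, "ABD": 3, "AB": 1, "AC": 1}

for p in (1.0, 0.9, 0.6, 0.5):
    n, kept, cases = filter_by_percentage(counts, p)
    print(f"percentage={p}: N={n}, kept {cases} of 10 cases")
```

Under this rule the number of kept cases is a step function of the percentage: it only changes when the threshold N moves from one occurrence count to the next, so many different percentages can return the same filtered log.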
…eir-visualization-in-the-performance-dfg' into 'integration'
[Priority 2] Support for the computation of sojourn times and their visualization in the performance DFG
Closes #179
See merge request process-mining/pm4py/pm4py-core!1163