update log output wording "Removing from" #20

Open
Shooter3k opened this issue Jan 27, 2021 · 4 comments

Comments

@Shooter3k

Can you change the "Removing from" message to something like "Removing data from source file found in"?

Whenever I read "Removing from", I think it's removing data from that file.

Reading "l:\hashes.txt"...6632263536 bytes total in 13.1248 seconds
Counting lines...Found 486862874 lines in 9.7545 seconds
Optimal HashPrime is 805306457
Estimated memory required: 20,969,378,808 (19.53Gbytes)
Processing input list... 486862873 unique (1 duplicate lines) in 184.6410 seconds
Occupancy is 365354075/805306457 45.3683%, Maxdepth=9
Removing from "l:\hashes.2.txt"... 483376919 removed
Removing from "l:\hashes.3.txt"... 49710 removed

@roycewilliams
Contributor

Yeah, we debated that back and forth a bit. Since paths can be long, we were also trying to be brief.

Hmm, how about just dropping the 'from'?

Removing "l:\hashes.2.txt"... 483376919 removed
Removing "l:\hashes.3.txt"... 49710 removed

In context, I hope that users would know that we didn't mean removing the file ... hmm.

@Shooter3k
Author

It seems like the issue is the way you list the files used to clean the source.

How about:
Comparing "l:\hashes.2.txt"... 483376919 lines removed from source
Comparing "l:\hashes.3.txt"... 49710 lines removed from source

@roycewilliams
Contributor

That's sufficiently precise, but quite a bit wordier than the previous wording. One of the design goals was to maximize single-line "real estate" for filenames with long paths.
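
To put rough numbers on that trade-off, here is a minimal sketch that counts the fixed characters each wording proposed so far spends besides the path and the count. The template strings are copied from the examples in this thread; the `TEMPLATES` dict and the function names are hypothetical and are not taken from the tool's source.

```python
# Wordings proposed in this thread. {path} is the filename and {n} is the
# number of removed lines. The dict and the helpers below are illustrative only.
TEMPLATES = {
    "current": 'Removing from "{path}"... {n} removed',
    "drop_from": 'Removing "{path}"... {n} removed',
    "comparing": 'Comparing "{path}"... {n} lines removed from source',
}


def fixed_overhead(template: str) -> int:
    """Return how many characters a template uses besides the path and count."""
    return len(template.format(path="", n=""))


def overheads() -> dict:
    """Map each wording's name to its fixed character cost."""
    return {name: fixed_overhead(t) for name, t in TEMPLATES.items()}


if __name__ == "__main__":
    for name, cost in overheads().items():
        print(f"{name}: {cost} fixed characters")
```

Every character of fixed wording is one fewer character available for the path on a line of a given width, which is the cost being weighed here.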

@roycewilliams
Contributor

roycewilliams commented May 26, 2023

Later discussion suggests that a good solution to avoid the understandable "moment of panic" might be to just re-summarize what's about to happen, just prior to the removing phase, and eliminate the need to directly say "removing" at all, as in:

[...]

Removing records from "l:\hashes.txt" when found in other files ...
Reading "l:\hashes.2.txt"... 483376919 removed
Reading "l:\hashes.3.txt"... 49710 removed

Starting with "Reading" removes the potential panic (even if not fully accurate), and the "N removed" at the end provides sufficient context. If "Reading" is insufficiently precise, "Processing" would also work - but IMO the extra psychological assurance of "read-only-ness" of using "Reading" is worth the trade-off in accuracy.
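
As a rough illustration of that layout, here is a minimal sketch that builds the summary line once and then one "Reading" line per file. The function name `removal_report` and its parameters are hypothetical, and this is not how the tool itself produces its output.

```python
from __future__ import annotations


def removal_report(source: str, results: list[tuple[str, int]]) -> list[str]:
    """Build the proposed log lines.

    source  -- the file that records are removed from
    results -- (path, removed_count) pairs for each file that was read
    """
    # State once, up front, which file is actually being modified.
    lines = [f'Removing records from "{source}" when found in other files ...']
    # The per-file lines then only need to say what was read and the outcome.
    for path, removed in results:
        lines.append(f'Reading "{path}"... {removed} removed')
    return lines


if __name__ == "__main__":
    report = removal_report(
        r"l:\hashes.txt",
        [(r"l:\hashes.2.txt", 483376919), (r"l:\hashes.3.txt", 49710)],
    )
    print("\n".join(report))
```

Only the summary line names the modified file, so each per-file line keeps its short prefix and leaves room for long paths.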
