Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from 'cut' to 'head'; cut isn't doing what it should here #20

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

compilingEntropy
Copy link

Based on the comment on line 66-67, the cut command is supposed to just grab the first 1024 bytes of the file and pass that to awk. Instead, the cut does... something else. I didn't bother figuring out what, but here's some tests that paint the picture:

$ time cut -b -1024 ./Downloads/World\ War\ Hulk.pdf | wc -c
 403378749
cut -b -1024 ./Downloads/World\ War\ Hulk.pdf  32.49s user 0.28s system 99% cpu 32.899 total
wc -c  3.28s user 0.19s system 10% cpu 32.898 total
$ time cat ./Downloads/World\ War\ Hulk.pdf | wc -c
 421636877
cat ./Downloads/World\ War\ Hulk.pdf  0.02s user 0.21s system 6% cpu 3.516 total
wc -c  3.44s user 0.07s system 99% cpu 3.515 total
$ time head -c 1024 ./Downloads/World\ War\ Hulk.pdf | wc -c
    1024
head -c 1024 ./Downloads/World\ War\ Hulk.pdf  0.00s user 0.00s system 71% cpu 0.004 total
wc -c  0.00s user 0.00s system 67% cpu 0.004 total
  • The cut command chops out some of the file, sure, but not the intended portion (>95% of the file remains) and for my (large) file, it takes 33 seconds to do so.
  • A simple cat in the second command is 10x faster than the cut, completing in 3.5 seconds and containing the entire file.
  • If you use head instead, you get the intended first 1024 bytes and the command completes effectively instantly.

This commit moves away from cut to the head command, which accomplishes the intended file slice and improves performance (particularly on large files).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant