Skip to content
This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

plumbing: packfile/scanner, readability/performance improvements, zlib pooling #1124

Merged
merged 1 commit into from
Apr 23, 2019

Conversation

saracen
Copy link
Contributor

@saracen saracen commented Apr 21, 2019

This migrates scanner's trackableReader, byteReadSeeker and teeReader's functionality into one scannerReader. I wanted the ability to have a pool of Scanners if need be, and offering a Reset() and dealing with all 3 readers was proving difficult. I believe this version is also easier to read.

In addition:

  • There's a zlib pool. Pooling zlib isn't obvious, as although it has a Reset() function, zlib.NewReader doesn't accept nil. This limitation is by-passed by priming the reader with a zero byte compressed payload. I couldn't find a better way of doing this.
  • Previously, writing to the crc hash still often performed small writes (via Read(), not just ReadByte()). Now all small reads are buffered when they need to be. The buffer size is now smaller and sized to take advantage of the CLMUL instruction crc32 uses.
  • Deprecated scanner's Flush(). There's no need for this to be public.

@filipnavara
Copy link
Contributor

I have seen some woefully unoptimal code paths in that code when I was recently profiling Gitea. I'd gladly give it a try once it passes the CI tests. It may be a better alternative to #1122.

crc: crc,
IsSeekable: ok,
}
}

func (s *Scanner) Reset(r io.Reader) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to hace a test that reuses scanner with Reset. Other than that LGTM! Thanks!

Copy link
Contributor Author

@saracen saracen Apr 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good shout! I noticed a problem whilst figuring out how/what to test. Even when the reader is seekable, I'm returning the r.offset when whence == io.SeekCurrent && offset == 0, that incorrectly assumes that the reader provided to start with was set at the beginning of the file.

I'll fix that and add a test shortly!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added two tests. TestReaderReset checks basics usage and that fields are correctly reset. TestReaderResetSeeks checks that the seek behavior is correct.

@saracen saracen force-pushed the packfile-scanner-improved-reader branch from a835515 to 9d4279f Compare April 22, 2019 21:30
@mcuadros mcuadros merged commit 54ff89f into src-d:master Apr 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants