This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

[performance+memory] Beating git in index-pack (as used for clones and fetches) ✅ 🚀 #5

Closed
Byron opened this issue Aug 4, 2020 · 3 comments

Comments

@Byron
Member

Byron commented Aug 4, 2020

git index-pack streams a pack and creates an index from it. The difficulty arises from having to decompress every entry in the pack stream, which can be composed of many small objects. These are placed in some sort of in-memory index to accelerate the next stage, which is all about resolving the deltas in order to produce a SHA1 for each object. Per pack entry, the SHA1, pack offset and CRC32 are written into the index file to complete the operation.
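To make the last step concrete, here is a minimal sketch of the per-entry record and of the 256-slot fan-out table that git's version 2 pack index uses to narrow lookups by the first byte of the object id. The struct, field and function names are illustrative assumptions and are not taken from gitoxide or git; the sketch only sorts entries and counts them, it does not write a real index file.

```rust
/// One record per pack entry: the three values named above.
/// Type and field names are hypothetical, chosen for this sketch only.
#[derive(Debug, Clone, PartialEq, Eq)]
struct IndexEntry {
    sha1: [u8; 20],
    pack_offset: u64,
    crc32: u32,
}

/// Sort entries by object id and build the fan-out table:
/// slot `b` holds the number of entries whose first id byte is <= `b`.
fn sort_and_fan_out(entries: &mut [IndexEntry]) -> [u32; 256] {
    entries.sort_by(|a, b| a.sha1.cmp(&b.sha1));
    let mut fan_out = [0u32; 256];
    // First pass: count entries per leading byte.
    for entry in entries.iter() {
        fan_out[entry.sha1[0] as usize] += 1;
    }
    // Second pass: turn per-byte counts into cumulative counts.
    for i in 1..256 {
        fan_out[i] += fan_out[i - 1];
    }
    fan_out
}

/// Helper for the demo: an entry whose id is all zeros except the first byte.
fn entry(first_byte: u8, pack_offset: u64) -> IndexEntry {
    let mut sha1 = [0u8; 20];
    sha1[0] = first_byte;
    IndexEntry { sha1, pack_offset, crc32: 0 }
}

fn main() {
    let mut entries = vec![entry(0xff, 12), entry(0x00, 300), entry(0x7f, 150)];
    let fan_out = sort_and_fan_out(&mut entries);
    println!("first offset: {}, total entries: {}", entries[0].pack_offset, fan_out[255]);
}
```

The last fan-out slot always equals the total number of entries, which is why a reader of the index can learn the object count from it.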

The indexing phase is inherently single-threaded with little potential for improvement, whereas the resolving phase is fully multithreaded and entirely lock free. The first phase could be improved by writing the pack file in parallel - right now it happens after reading it (the pack file is used later for lookup to not hold everything in memory). However, IO doesn't appear to be the bottleneck at all.
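The lock-free property of the resolving phase can be illustrated with a toy model. This is a sketch under loud assumptions and not gitoxide's implementation: deltas are modelled as integers added to their predecessor instead of real copy/insert instructions, and `DeltaTree`, `resolve_tree` and `resolve_parallel` are hypothetical names. The point it shows is that once the sequential phase has grouped entries into independent trees, each thread can work on a disjoint set of trees without sharing mutable state.

```rust
use std::thread;

/// A toy delta tree: a base value plus a chain of deltas applied in order.
/// Integers stand in for object data; this is an assumption of the sketch.
struct DeltaTree {
    base: i64,
    deltas: Vec<i64>,
}

/// Resolve every object in one tree: the base, then each delta
/// applied to the object before it.
fn resolve_tree(tree: &DeltaTree) -> Vec<i64> {
    let mut out = Vec::with_capacity(tree.deltas.len() + 1);
    let mut current = tree.base;
    out.push(current);
    for delta in &tree.deltas {
        current += delta;
        out.push(current);
    }
    out
}

/// Hand disjoint chunks of trees to scoped threads. Threads only read
/// their own chunk and return their own results, so no locks are needed.
fn resolve_parallel(trees: &[DeltaTree], threads: usize) -> Vec<Vec<i64>> {
    let threads = threads.max(1);
    let chunk_size = ((trees.len() + threads - 1) / threads).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = trees
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(resolve_tree).collect::<Vec<_>>()))
            .collect();
        // Joining in spawn order keeps results in the original tree order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let trees = vec![
        DeltaTree { base: 10, deltas: vec![1, 2] },
        DeltaTree { base: 100, deltas: vec![] },
        DeltaTree { base: 5, deltas: vec![-5] },
    ];
    println!("{:?}", resolve_parallel(&trees, 2));
}
```

Splitting by whole trees is what keeps the threads independent here: a delta only ever needs the object it is based on, and that object lives in the same tree.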

Compared to gitoxide, git is considerably faster when creating the index, averaging 54MB/s of reading uncompressed bytes. gitoxide clocks in at about 50MB/s, and slows down considerably towards the end. Part of that slowdown might be attributed to this issue with resetting miniz_oxide's decompressor.

Luckily gitoxide is way faster when resolving deltas, which already gives it a good first place in the race, with some room for more if it manages to get as fast as git when decompressing and indexing objects.

The picture below shows the fastest git run I could produce, probably with everything being properly cached:

Screenshot 2020-08-04 at 12 04 36

Without cache, it seems to look different:

Screenshot 2020-08-04 at 12 04 36

The fastest gitoxide runs are shown next. They are pretty comparable in the amount of work done, as they also write out the pack and the index. The only difference is that they read the pack file directly instead of from stdin; it is streamed nonetheless, and the difference is merely an oversight.

Screenshot 2020-08-04 at 12 28 45

Memory consumption of git hovers consistently around 650MB (for the kernel pack), which is higher than the 580MB that gitoxide uses. However, gitoxide can temporarily use more memory as it keeps intermediate decompressed objects per thread, whose maximum sizes depend on the number of children and the base size. Because of that, I have seen it go up to 850MB for small fractions of time.

@Byron
Member Author

Byron commented Aug 4, 2020

memory mode: in-memory, resolve-bases, resolve-deltas, resolve-deltas-and-bases

This was more of a test: I wanted to see whether keeping the decompressed data in memory makes any difference in speeding up downstream operations.

It looks like this actually reduces performance, at least while the pack is also streamed to disk at the same time. The virtual memory system probably caches it entirely.

Screenshot 2020-08-04 at 12 42 52

When not streaming the pack to disk, in-memory operation appears to yield a mild speedup. But when writing a temporary file is allowed, the speedup is entirely gone. Thus it seems that keeping decompressed bytes really doesn't do any good.

Screenshot 2020-08-04 at 12 51 15

➜  grit-rs git:(main) ✗ time ./target/release/gixp -t 4 index-from-pack  --verbose -i verify -m in-memory out < tests/fixtures/repos/rust.git/objects/pack/pack-af2bf75bff7478224bd4242e5106241ab6026232.pack
 04:47:38 indexing done 1266143 objects in 14.51s (87286.266 objects/s)
 04:47:48 Resolving done 1266143 objects in 9.76s (129665.88 objects/s)
 04:47:49 writing index file done 35453056 bytes in 0.13s (262973120 bytes/s)
 04:47:49  create index file done 1266143 objects in 24.97s (50702.586 objects/s)
./target/release/gixp -t 4 index-from-pack --verbose -i verify -m in-memory    46.92s user 4.46s system 205% cpu 24.992 total
➜  grit-rs git:(main) ✗ time ./target/release/gixp -t 4 index-from-pack  --verbose -i verify -m in-memory < tests/fixtures/repos/rust.git/objects/pack/pack-af2bf75bff7478224bd4242e5106241ab6026232.pack
 04:48:29 indexing done 1266143 objects in 13.19s (95986.98 objects/s)
 04:48:38 Resolving done 1266143 objects in 9.49s (133432.3 objects/s)
 04:48:39 writing index file done 35453056 bytes in 0.08s (433581380 bytes/s)
 04:48:39  create index file done 1266143 objects in 23.34s (54253.984 objects/s)
./target/release/gixp -t 4 index-from-pack --verbose -i verify -m in-memory <  47.60s user 2.85s system 216% cpu 23.354 total
➜  grit-rs git:(main) ✗ time ./target/release/gixp -t 4 index-from-pack  --verbose -i verify < tests/fixtures/repos/rust.git/objects/pack/pack-af2bf75bff7478224bd4242e5106241ab6026232.pack
 04:50:53 indexing done 1266143 objects in 13.64s (92843.51 objects/s)
 04:51:02 Resolving done 1266143 objects in 9.32s (135791.89 objects/s)
 04:51:03 writing index file done 35453056 bytes in 0.08s (429502050 bytes/s)
 04:51:03  create index file done 1266143 objects in 23.64s (53559.31 objects/s)
./target/release/gixp -t 4 index-from-pack --verbose -i verify <   46.50s user 3.01s system 208% cpu 23.693 total
➜  grit-rs git:(main) ✗

@Byron Byron pinned this issue Aug 4, 2020
@Byron Byron changed the title from "[performance] Beating git in index-pack (as used for clones and fetches)" to "[performance+memory] ✅ 🚀 Beating git in index-pack (as used for clones and fetches)" Aug 5, 2020
@Byron Byron changed the title from "[performance+memory] ✅ 🚀 Beating git in index-pack (as used for clones and fetches)" to "[performance+memory] Beating git in index-pack (as used for clones and fetches) ✅ 🚀" Aug 5, 2020
@Byron
Member Author

Byron commented Aug 13, 2020

For the actual performance tests on a 96 core machine, have a look at this comment. TL;DR: the time is dominated by creating an index by streaming the pack, and pack resolution is then done in about 10 seconds, or at 14.6GB/s (of decoded objects).

@Byron Byron closed this as completed Aug 13, 2020
@Byron
Member Author

Byron commented Dec 6, 2020

The ARM git provided with MacOS Big Sur changes everything:

With 3 threads (default)

➜  gitoxide git:(main) time git index-pack --stdin -v < ./tests/fixtures/repos/linux.git/objects/pack/pack-3ee05b0f4e4c2cb59757c95c68e2d13c0a491289.pack
Receiving objects: 100% (7600359/7600359), 1.26 GiB | 82.23 MiB/s, done.
Resolving deltas: 100% (6396700/6396700), done.
pack    3ee05b0f4e4c2cb59757c95c68e2d13c0a491289
git index-pack --stdin -v <   75.78s user 5.78s system 199% cpu 40.845 total

Git is at least twice as fast when reading/streaming the pack. In our case this is limited by the deflate performance of millions of small streams, and there are still some improvements that we can make use of.

With 8 threads (as available cores)

➜  gitoxide git:(main) time git -c pack.threads=8 index-pack --stdin -v < ./tests/fixtures/repos/linux.git/objects/pack/pack-3ee05b0f4e4c2cb59757c95c68e2d13c0a491289.pack
Receiving objects: 100% (7600359/7600359), 1.26 GiB | 49.73 MiB/s, done.
Resolving deltas: 100% (6396700/6396700), done.
pack    3ee05b0f4e4c2cb59757c95c68e2d13c0a491289
git -c pack.threads=8 index-pack --stdin -v <   129.59s user 29.72s system 174% cpu 1:31.07 total

Clearly contention reduces speed. This effect is not visible at all when verifying pack entries, making me think the amount of work git does is entirely different (by now).

gitoxide loses whenever many small objects need to be decompressed, which is clearly visible in the later stages of the decoding stage. It also uses more CPU time, disproportionate to the speed gain.

➜  gitoxide git:(main) time ./target/release/gixp --verbose pack-index-from-data -p tests/fixtures/repos/linux.git/objects/pack/pack-3ee05b0f4e4c2cb59757c95c68e2d13c0a491289.pack out
 16:19:30 read pack done 1.4GB in 29.48s (46.2MB/s)
 16:19:30  indexing done 7.6M objects in 29.48s (257.8k objects/s)
 16:19:30 decompressing done 2.7GB in 29.48s (92.2MB/s)
 16:19:57     Resolving done 7.6M objects in 26.78s (283.9k objects/s)
 16:19:57      Decoding done 95.6GB in 26.78s (3.6GB/s)
index: 3fe49647a452e0f7ee6a857fb65ee243f6e38bb3
pack: 3ee05b0f4e4c2cb59757c95c68e2d13c0a491289
 16:19:59 writing index file done 212.8MB in 0.42s (506.2MB/s)
 16:19:59  create index file done 7.6M objects in 57.86s (131.4k objects/s)
./target/release/gixp --verbose pack-index-from-data -p  out  226.00s user 3.44s system 396% cpu 57.911 total
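The CPU-time remark can be checked against the `time` summaries of the three runs above. A small sketch, assuming the usual reading of shell `time` output (user and system are CPU seconds summed over all threads, total is wall-clock time); the `Timing` type is a hypothetical helper for this illustration only, and 1:31.07 is entered as 91.07 seconds.

```rust
/// Timings as printed by the shell's `time`: CPU seconds in user and
/// system mode, and elapsed wall-clock seconds.
struct Timing {
    user: f64,
    system: f64,
    wall: f64,
}

impl Timing {
    /// Total CPU seconds spent across all threads.
    fn cpu_seconds(&self) -> f64 {
        self.user + self.system
    }

    /// Average CPU utilisation in percent (100 = one core fully busy).
    fn cpu_percent(&self) -> f64 {
        100.0 * self.cpu_seconds() / self.wall
    }
}

fn main() {
    // Values copied from the three `time` summary lines in this comment.
    let runs = [
        ("git, 3 threads", Timing { user: 75.78, system: 5.78, wall: 40.845 }),
        ("git, 8 threads", Timing { user: 129.59, system: 29.72, wall: 91.07 }),
        ("gitoxide", Timing { user: 226.00, system: 3.44, wall: 57.911 }),
    ];
    for (name, timing) in &runs {
        println!(
            "{}: {:.2}s CPU over {:.2}s wall = {:.1}% utilisation",
            name,
            timing.cpu_seconds(),
            timing.wall,
            timing.cpu_percent()
        );
    }
}
```

Truncated to whole percent, the computed utilisation agrees with the 199%, 174% and 396% that the shell printed. In these runs gitoxide spends roughly 2.8 times the CPU seconds of git with default threads (229.44s against 81.56s).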

@GitoxideLabs GitoxideLabs locked and limited conversation to collaborators Feb 26, 2021
@Byron Byron unpinned this issue Oct 7, 2021
