Listing a Directory with Many files takes a long time #182

Closed
jag3773 opened this issue May 18, 2017 · 4 comments

@jag3773
Contributor

jag3773 commented May 18, 2017

Story

As a user, I don't want to have to wait a long time for a file listing when I click on a directory in my repository, so that I don't waste time or get frustrated.

Notes

Opening https://git.door43.org/Door43/en_tw/src/master/bible/other takes a while.

@ethantkoenig
Contributor

Relevant upstream issue: go-gitea/gitea#502

From poking around, the bottleneck seems to be with looking up the latest commit for each file in the directory; a separate git log -1 <filename> command is run for each file. These commands are already run in parallel, and I can't find an alternative way to look up the latest commits in batches.
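
For illustration only, the per-file lookup described above can be sketched in Python roughly as follows. This is not the actual Gitea code: the function names, the worker count, and the injectable `run` callable are assumptions made for the example. Only the `git log -1` invocation comes from the comment.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def last_commit_cmd(path: str) -> List[str]:
    # `--` keeps git from mistaking a file name for a revision.
    return ["git", "log", "-1", "--format=%H", "--", path]


def run_git(cmd: List[str], repo: str = ".") -> str:
    # Hypothetical helper: run one git command and return its trimmed stdout.
    out = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return out.stdout.strip()


def last_commits_per_file(
    paths: List[str],
    run: Callable[[List[str]], str] = run_git,
) -> Dict[str, str]:
    # One git process per entry: even in parallel, the work grows with the
    # number of files in the directory. The worker count of 8 is arbitrary.
    with ThreadPoolExecutor(max_workers=8) as pool:
        hashes = list(pool.map(lambda p: run(last_commit_cmd(p)), paths))
    return dict(zip(paths, hashes))
```

Each entry costs a full history walk in its own process, which is why a directory with many files is slow to list.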

@ethantkoenig
Contributor

Actually, I think I have a way to speed things up; it'll require making a change to an upstream dependency (https://github.com/go-gitea/git).

@jag3773
Contributor Author

jag3773 commented May 19, 2017

This same bottleneck is why we couldn't use Amazon Elastic File System (EFS) and had to go with an EBS volume instead. I thought storing this data in the database might help.

What did you have in mind for a fix?

@ethantkoenig
Contributor

ethantkoenig commented May 19, 2017

The general idea is to get all commit hashes affecting any of the directory's entries (git log path/to/dir), then for each commit hash list the affected files (git ls-tree <hash>). We start with the latest commit, and go backwards until we've found a commit for each entry.

A proof-of-concept test I ran locally showed a ~20x improvement (15 vs 0.7 seconds) for the example repo.
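
A minimal Python sketch of the batched idea, under stated assumptions: it is not the proof-of-concept mentioned above, and it works on data already collected rather than calling git. `history` is assumed to be the output of `git log path/to/dir` (newest first, treated as linear with no merges), paired with the per-commit directory listing that `git ls-tree <hash>` would give (entry name to object hash). Because `git ls-tree` reports tree contents rather than changed files, the sketch detects a change by comparing each entry's hash with the next older listing. The names `Snapshot` and `latest_commits` are hypothetical.

```python
from typing import Dict, List, Tuple

# Entry name -> object hash, as one `git ls-tree` listing would report.
Snapshot = Dict[str, str]


def latest_commits(history: List[Tuple[str, Snapshot]]) -> Dict[str, str]:
    """Map each entry in the newest listing to the newest commit that changed it.

    `history` is ordered newest first and assumed to be linear.
    """
    if not history:
        return {}
    # Entries still unresolved, with their hashes at the newest commit.
    pending = dict(history[0][1])
    result: Dict[str, str] = {}
    for i, (commit, _snapshot) in enumerate(history):
        # Listing of the next older commit; empty once history runs out.
        older = history[i + 1][1] if i + 1 < len(history) else {}
        for name in list(pending):
            # The entry had different content (or did not exist) before
            # this commit, so this commit is the one that last changed it.
            if older.get(name) != pending[name]:
                result[name] = commit
                del pending[name]
        if not pending:
            break  # every entry has a commit; stop walking backwards
    return result
```

The walk stops as soon as every entry is resolved, so the number of git invocations depends on how far back the history must be read, not on the number of files in the directory.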

@benjore benjore added this to the WST Sprint #27 milestone Jun 1, 2017
@bspidel bspidel self-assigned this Jun 2, 2017
@richmahn richmahn reopened this Jun 5, 2017
@jag3773 jag3773 closed this as completed Jun 15, 2017
This was referenced Jun 26, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023