Slow page loads with a large repo #491
Comments
I tested this on macOS on the lowest-powered MacBook Air; loading Rails took 3500 ms to 4200 ms. I think that's good enough for v1.1. |
@lunny could you please give alpinelinux aports a try? Then try to browse the main directory. It takes a couple of minutes on a powerful server mainly because of I remember this issue has also been reported on Gogs before but was never taken care of. Some have suggested using a caching system. A simpler approach would be to fetch a directory list (like GitHub does) and, if needed, fetch commit messages asynchronously via JavaScript. cgit only shows the directory list, which is pretty fast (if the current implementation allows it, please add an option to disable fetching of commit messages). |
I will try it. @clandmeter |
Which page do you want to test? @clandmeter On my machine, the main page takes 1763 ms and the first release page takes 6662 ms. |
@lunny can you check the main directory like this one at github: |
@lunny btw, I'm using 1.0.1. I believe the performance commits for the tags page landed after 1.0.1, or in another branch. |
@clandmeter Yes, I tested on master. I think v1.0.1 may be slower than master. Yes, it's related to #502. |
@lunny I tried master today on both Linux (Alpine Linux) and Windows 10. Both crash at startup, so I cannot verify whether it's faster. |
Where is the crash log? |
|
I am stopped by the same panic message as @clandmeter (I don't know if it is the same issue, I was trying to update my Gitea installation - running on Docker)
|
resolved by #708 |
@lunny seems master branch is working again so I did some small tests:
|
Yes. This issue should be fixed by #570. |
Moving this to v1.2 since #570 has been moved. |
@lunny any progress in this area? I'm still getting very slow loads on large directory contents: https://try.gitea.io/clandmeter/aports/src/branch/master/community Would it be possible to have a pager or to disable the loading of commit history? |
GitHub only shows the first 1000 files. |
Oh my @filipnavara Awesome job on the performance improvements. I pulled your |
#6364 was merged yesterday! 😄 |
I have a repo with a folder with more than 2000 files. This takes ~25 seconds to load (not production site), of which 24 seconds are spent in In addition to any performance improvements possible, maybe a new option could be introduced to display only file names (without latest commit info) if a folder contains more than x entries (folders and files). That way very big folders can still be shown quickly but if you want to see commit details/history you need to enter the specific file. |
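The "file names only above x entries" idea from the comment above could look roughly like this. This is a minimal sketch, not Gitea code: `Entry`, `listDir`, the `limit` parameter, and the `lookup` callback are all hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Entry is one row of a directory listing (hypothetical type).
type Entry struct {
	Name       string
	LastCommit string // left empty when commit info was skipped
}

// listDir builds the listing for a directory. The expensive per-file
// lookup is only called when the directory has at most limit entries;
// larger directories get file names only.
func listDir(names []string, limit int, lookup func(name string) string) []Entry {
	withInfo := len(names) <= limit
	entries := make([]Entry, 0, len(names))
	for _, n := range names {
		e := Entry{Name: n}
		if withInfo {
			e.LastCommit = lookup(n)
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	// Stand-in for the real "latest commit touching this file" query.
	lookup := func(name string) string { return "last commit for " + name }

	small := listDir([]string{"a.go", "b.go"}, 2, lookup)
	large := listDir([]string{"a.go", "b.go", "c.go"}, 2, lookup)
	fmt.Println(small[0].LastCommit != "", large[0].LastCommit != "")
}
```

With a limit like this, a very large folder still renders quickly, and the commit details remain available by opening the individual file.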
@davidsvantesson You can speed it up a bit by building commit-graph file ( |
@filipnavara That is very interesting, but I do not see any change in performance for listing repo files in Gitea. Maybe Gitea doesn't run operations that benefit from it? Edit: That is strange, because I have the code of #7314, but it doesn't seem to improve my performance. I will investigate further. |
I think the problem is that It would be interesting to change to use something like |
Yes, that is the pathological case and there is no way around it unless you introduce some new cache or statistical structure (bloom filters) to speed this up. The algorithm in |
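To illustrate why a statistical structure such as a Bloom filter helps here: if each commit carries a small filter of the paths it changed, a history walk can skip every commit whose filter says "definitely not this path" without diffing any trees. The sketch below is a generic Bloom filter under stated assumptions; it is not the format discussed on the git mailing list, and `Bloom`, `Commit`, `candidates`, and the example paths are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a fixed-size Bloom filter over strings.
// It can return false positives but never false negatives.
type Bloom struct {
	bits []uint64
	k    uint32 // number of bit positions probed per key
}

// NewBloom allocates a filter with at least nbits bits (nbits must be > 0).
func NewBloom(nbits int, k uint32) *Bloom {
	return &Bloom{bits: make([]uint64, (nbits+63)/64), k: k}
}

// positions derives k bit positions from two FNV hashes (double hashing).
func (b *Bloom) positions(key string) []uint32 {
	h1 := fnv.New32a()
	h1.Write([]byte(key))
	h2 := fnv.New32()
	h2.Write([]byte(key))
	base, step := h1.Sum32(), h2.Sum32()|1
	n := uint32(len(b.bits) * 64)
	out := make([]uint32, b.k)
	for i := range out {
		out[i] = (base + uint32(i)*step) % n
	}
	return out
}

// Add records key in the filter.
func (b *Bloom) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p/64] |= uint64(1) << (p % 64)
	}
}

// MayContain reports false only if key was definitely never added.
func (b *Bloom) MayContain(key string) bool {
	for _, p := range b.positions(key) {
		if b.bits[p/64]&(uint64(1)<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

// Commit pairs a commit ID with a filter of the paths it changed.
type Commit struct {
	ID      string
	Changed *Bloom
}

// candidates returns the commits that may have touched path; only these
// would still need a real tree comparison.
func candidates(history []Commit, path string) []string {
	var ids []string
	for _, c := range history {
		if c.Changed.MayContain(path) {
			ids = append(ids, c.ID)
		}
	}
	return ids
}

func main() {
	c1 := Commit{ID: "c1", Changed: NewBloom(256, 3)}
	c1.Changed.Add("community/foo/APKBUILD")
	c2 := Commit{ID: "c2", Changed: NewBloom(256, 3)}
	c2.Changed.Add("main/bar/APKBUILD")
	fmt.Println(candidates([]Commit{c1, c2}, "community/foo/APKBUILD"))
}
```

A false positive only costs one unnecessary tree comparison, so the result stays correct while most commits are skipped cheaply.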
@filipnavara A simple command-line git operation made it clear that Gitea is already very efficient. The limitation seems to be in Git itself. The performance of this operation can't be improved much, since git doesn't cache the information we want in the tree, and it also doesn't directly store which files are changed by a commit, so we get this high order. I find it a bit strange that there is no option to cache additional information in git to speed this up, as it should be quite a common use case. I still think not showing this information (by default) for very large folders can be useful for these special cases. |
It's normal to warn the user if the diff will be too large, or if there are too many files to diff. So for this operation too, I think it's useful to hold back on the details if there is some indication that the operation will take too long (e.g. repository size? some statistics?). |
I think it could be improved by adding a cache layer in front of the git command. |
A cache system would be good for viewing "HEAD", which is what most people use. It wouldn't help if someone wants to browse old history, unless some cache option is built into git for all trees (which I think would be outside the scope of Gitea). @guillep2k I thought it could be based on the number of entries (folders and files) in the folder being displayed. However, a more accurate indication of the time needed would be the number of entries times the number of commits (in that folder), which you can still obtain with little effort. |
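As a rough illustration of the cache idea: because a commit is immutable, the "last commit per file" result for a given commit ID and directory never goes stale, so it can be memoized under that pair. This is a minimal in-memory sketch with no eviction, not Gitea's implementation; `LastCommitCache`, `cacheKey`, and the `compute` callback are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one directory listing at one exact commit.
type cacheKey struct {
	commitID string
	dir      string
}

// LastCommitCache memoizes "file name -> last commit ID" maps.
type LastCommitCache struct {
	mu    sync.Mutex
	items map[cacheKey]map[string]string
}

// NewLastCommitCache returns an empty cache.
func NewLastCommitCache() *LastCommitCache {
	return &LastCommitCache{items: make(map[cacheKey]map[string]string)}
}

// Get returns the cached listing for (commitID, dir), calling compute
// only on a miss. The lock is held while computing, which keeps the
// sketch simple but serializes concurrent misses.
func (c *LastCommitCache) Get(commitID, dir string, compute func() map[string]string) map[string]string {
	k := cacheKey{commitID: commitID, dir: dir}
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.items[k]; ok {
		return v
	}
	v := compute()
	c.items[k] = v
	return v
}

func main() {
	cache := NewLastCommitCache()
	walks := 0
	// Stand-in for the slow history walk.
	slow := func() map[string]string {
		walks++
		return map[string]string{"README.md": "abc123"}
	}
	cache.Get("head-sha", "community", slow)
	cache.Get("head-sha", "community", slow) // served from the cache
	fmt.Println("history walks:", walks)
}
```

This matches the point made above: repeated views of the same HEAD commit hit the cache, while every distinct historical commit is a separate key and pays the full cost on its first view.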
I have a prototype implementation of the git bloom filters which speed up browsing both HEAD and old history. I didn't pursue it further because I waited for an official git implementation. That said, I can revive it if anyone is brave enough to give it a try. |
This is still a problem with 1.9.4 (for the record). I get 6 seconds rendering for a mirror of qgis: Note that try.gitea.io gives a 500 (Internal Server Error) on the page I tried to set up for that: |
@filipnavara Do you have any insight into the chances that bloom filters get into git officially anytime soon? What problems could arise if Gitea used its own unofficial implementation of bloom filters (risks, effort, etc.)? |
I don't know if there was any progress. There were a few people who were interested in it, but it didn't really move forward except for a few experimental implementations at the end of last year.
Azure Git hosting does exactly that. It is perfectly doable and viable in the short term, at the small storage expense of duplicating some data structures. I will be happy to release my experimental implementation if someone wants to pick it up after me. I wrote all the code for reading and producing the bloom filters. The reading part was easy to integrate into Gitea. The writing part I did not integrate at all, and that still needs to be done (manual index building and scheduled index building). I currently don't have any free engineering hours to dedicate to the project, but I will be more than happy to help with it in any other way. |
@davidsvantesson I don't know what bloom filters are, but Gitea currently supports a considerable span of git versions, and there are plans to migrate to a pure golang implementation (I don't recall the library name). So, I wouldn't count much on implementing something that requires the latest git version. 😅 |
@guillep2k It is called go-git, and I was one of the people who were doing the migration of Gitea code to use it. Coincidentally, I was also the person who added one of the latest git features to go-git (commit graph files) specifically to speed up Gitea file listing. I also implemented the bloom filters on top of go-git in a file format that was compatible with one of the implementations discussed on the git mailing list... so I would say that it is very much possible to use the latest git features if there is a use case for it and sufficient demand. |
@filipnavara Yes, go-git was it. What I meant is that we should not count on users having the latest git installed on their systems. We can certainly provide the feature if it's implemented inside Gitea itself. |
That's exactly what I do, both in Gitea and indirectly in go-git. The commitgraph file and the bloom filters are optional git indexes stored in the .git directory. Gitea/go-git can consume and generate them, and a new enough git can use them if they exist. |
It seems that some of the docker users aren't getting the git commitGraph gitconfig changes. |
Possibly linked/related to #490
I like to keep some mirrors of popular projects such as Rails on my Gitea server. However, whenever I view that repo, the page can take 10 seconds or more to load (sometimes causing an nginx 502 timeout error).