Slow page loads with a large repo #491

Closed
deanpcmad opened this issue Dec 26, 2016 · 65 comments · Fixed by #10069
Labels
issue/confirmed (Issue has been reviewed and confirmed to be present or accepted to be implemented)
type/enhancement (An improvement of existing functionality)
Milestone

Comments

@deanpcmad

deanpcmad commented Dec 26, 2016

Possibly linked/related to #490

I like to keep mirrors of popular projects such as Rails on my Gitea server. However, whenever I view that repo, the page can take more than 10 seconds to load, sometimes causing an nginx 502 timeout error.

@lunny lunny added this to the 1.1.0 milestone Dec 27, 2016
@lunny lunny added the type/enhancement label Dec 27, 2016
@lunny
Member

lunny commented Jan 14, 2017

I tested this on macOS on a lowest-spec MacBook Air; loading Rails took 3500ms ~ 4200ms. I think that's good enough for v1.1.

@clandmeter

@lunny could you please give alpinelinux aports a try?

Then try to browse the main directory.

It takes a couple of minutes on a powerful server, mainly because of git log....

I remember this issue was also reported on Gogs before but was never taken care of. Some have suggested using a caching system. A simpler approach would be to fetch only the directory list (like GitHub does) and, if needed, fetch the commit messages asynchronously via JavaScript. cgit only shows the directory list, which is pretty fast. If the current implementation allows it, please add an option to disable fetching of commit messages.

@lunny
Member

lunny commented Jan 18, 2017

I will try it. @clandmeter

@lunny
Member

lunny commented Jan 18, 2017

Which page do you want to test? @clandmeter On my machine, the main page takes 1763ms and the first release page takes 6662ms.

@clandmeter

@lunny can you check the main directory like this one at github:

https://github.com/alpinelinux/aports/tree/master/main

@clandmeter

@lunny btw, I'm using 1.0.1. I believe the performance commits for the tags page landed after 1.0.1, or in another branch.

@clandmeter

@lunny I think #502 is related?

@lunny
Member

lunny commented Jan 20, 2017

@clandmeter Yes, I tested on master. I think v1.0.1 may be slower than master. And yes, it's related to #502.

@clandmeter

@lunny I tried master today on both Linux (Alpine Linux) and Win10. Both crash at startup, so I cannot verify whether it's faster.

@lunny
Member

lunny commented Jan 20, 2017

Where is the crash log?

@clandmeter

C:\Users\carlo\Desktop\gitea>gitea.exe web
2017/01/20 12:47:02 [W] Custom config 'C:/Users/carlo/Desktop/gitea/custom/conf/app.ini' not found, ignore this if you're running first time
2017/01/20 12:47:02 [T] Custom path: C:/Users/carlo/Desktop/gitea/custom
2017/01/20 12:47:02 [T] Log path: C:/Users/carlo/Desktop/gitea/log
2017/01/20 12:47:02 [I] Gitea v1.0.0+137-g1610b9f
2017/01/20 12:47:02 [I] Log Mode: Console(Trace)
2017/01/20 12:47:02 [I] Cache Service Enabled
2017/01/20 12:47:02 [I] Session Service Enabled
2017/01/20 12:47:02 [I] SQLite3 Supported
2017/01/20 12:47:02 [I] Run Mode: Development
panic: Macaron handler must be a callable function

goroutine 1 [running]:
panic(0xeec4e0, 0xc0434b7400)
        /usr/local/go/src/runtime/panic.go:500 +0x1af
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandler(0xeec4e0, 0xc0434b73e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:50 +0xbf
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandlers(0xc0434bfc80, 0x6, 0x8)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:58 +0x54
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Handle(0xc04200f360, 0x100f580, 0x4, 0xc0434c4160, 0x1b, 0xc0434c2f90, 0x6, 0x8, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:176 +0x417
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post(0xc04200f360, 0x1011b19, 0x6, 0xc0434c2f90, 0x3, 0x3, 0x10)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:210 +0x7c
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post-fm(0x1011b19, 0x6, 0xc0434c2f90, 0x3, 0x3, 0x3)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x63
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).route(0xc0434bd100, 0xc0434a6bb8, 0x100f580, 0x4, 0xc0434a6cc8, 0x3, 0x3, 0xeec4e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:322 +0x12e
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).Post(0xc0434bd100, 0xc0434a6cc8, 0x3, 0x3, 0xc0434b73e0)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x99
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1.6()
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:409 +0x4d9
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x1020d48, 0xe, 0xc0434a6f58, 0xc0434b70c0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1()
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:417 +0xc42
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x100e74a, 0x3, 0xc0434a71a0, 0xc04348cfc0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/routers/api/v1.RegisterRoutes(0xc0422c6580)
        /srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:450 +0xdf
code.gitea.io/gitea/cmd.runWeb.func17()
        /srv/app/src/code.gitea.io/gitea/cmd/web.go:609 +0x31
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc04200f360, 0x100f074, 0x4, 0xc0434a74a8, 0xc04348cfb0, 0x1, 0x1)
        /srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x112
code.gitea.io/gitea/cmd.runWeb(0xc042184140, 0x0, 0xc042184100)
        /srv/app/src/code.gitea.io/gitea/cmd/web.go:610 +0x1506
code.gitea.io/gitea/vendor/github.com/urfave/cli.HandleAction(0xf07f80, 0x113f1c8, 0xc042184140, 0xc0421a2200, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:471 +0xc0
code.gitea.io/gitea/vendor/github.com/urfave/cli.Command.Run(0x100edc8, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10318a9, 0x16, 0x0, ...)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/command.go:191 +0xcce
code.gitea.io/gitea/vendor/github.com/urfave/cli.(*App).Run(0xc04246e340, 0xc04203e3a0, 0x2, 0x2, 0x0, 0x0)
        /srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:241 +0x6aa
main.main()
        /srv/app/src/code.gitea.io/gitea/main.go:39 +0x35b

@drsect0r
Contributor

drsect0r commented Jan 20, 2017

I am stopped by the same panic message as @clandmeter (I don't know if it is the same issue, I was trying to update my Gitea installation - running on Docker)

bash-4.3$ /app/gitea/gitea web 
2017/01/20 11:54:36 [T] Custom path: /data/gitea
2017/01/20 11:54:36 [T] Log path: /data/gitea/log
panic: Macaron handler must be a callable function

goroutine 1 [running]:
panic(0x7ffa36d24140, 0xc42152b720)
	/usr/lib/go/src/runtime/panic.go:500 +0x1a5
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandler(0x7ffa36d24140, 0xc42152b700)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:50 +0xba
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.validateHandlers(0xc421555400, 0x6, 0x8)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/macaron.go:58 +0x4f
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Handle(0xc4205cc460, 0x7ffa3661ce90, 0x4, 0xc42155e7c0, 0x1b, 0xc421568300, 0x6, 0x8, 0x7ffa37b93020)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:176 +0x412
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post(0xc4205cc460, 0x7ffa3661f447, 0x6, 0xc421568300, 0x3, 0x3, 0x10)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:210 +0x77
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Post-fm(0x7ffa3661f447, 0x6, 0xc421568300, 0x3, 0x3, 0x3)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x5e
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).route(0xc4215430c0, 0xc4214d6bb8, 0x7ffa3661ce90, 0x4, 0xc4214d6cc8, 0x3, 0x3, 0x7ffa36d24140)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:322 +0x129
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*ComboRouter).Post(0xc4215430c0, 0xc4214d6cc8, 0x3, 0x3, 0xc42152b700)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:335 +0x94
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1.6()
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:409 +0x4d4
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3662e69d, 0xe, 0xc4214d6f58, 0xc42152b3e0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/routers/api/v1.RegisterRoutes.func1()
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:417 +0xc3d
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3661c13f, 0x3, 0xc4214d71a0, 0xc4214912e0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/routers/api/v1.RegisterRoutes(0xc420473980)
	/srv/app/src/code.gitea.io/gitea/routers/api/v1/api.go:450 +0xda
code.gitea.io/gitea/cmd.runWeb.func17()
	/srv/app/src/code.gitea.io/gitea/cmd/web.go:609 +0x2c
code.gitea.io/gitea/vendor/gopkg.in/macaron%2ev1.(*Router).Group(0xc4205cc460, 0x7ffa3661c9d4, 0x4, 0xc4214d74a8, 0xc4214912d0, 0x1, 0x1)
	/srv/app/src/code.gitea.io/gitea/vendor/gopkg.in/macaron.v1/router.go:190 +0x10d
code.gitea.io/gitea/cmd.runWeb(0xc4201c17c0, 0x0, 0xc4201c1700)
	/srv/app/src/code.gitea.io/gitea/cmd/web.go:610 +0x1501
code.gitea.io/gitea/vendor/github.com/urfave/cli.HandleAction(0x7ffa36d3fcc0, 0x7ffa36e476e8, 0xc4201c17c0, 0xc420058d00, 0x0)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:471 +0xbb
code.gitea.io/gitea/vendor/github.com/urfave/cli.Command.Run(0x7ffa3661c730, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffa3663ebbf, 0x16, 0x0, ...)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/command.go:191 +0xcc9
code.gitea.io/gitea/vendor/github.com/urfave/cli.(*App).Run(0xc42024b520, 0xc42000c140, 0x2, 0x2, 0x0, 0x0)
	/srv/app/src/code.gitea.io/gitea/vendor/github.com/urfave/cli/app.go:241 +0x6a5
main.main()
	/srv/app/src/code.gitea.io/gitea/main.go:39 +0x356

@lunny
Member

lunny commented Jan 20, 2017

resolved by #708

@lunny lunny closed this as completed Jan 20, 2017
@lunny lunny reopened this Jan 20, 2017
@clandmeter

@lunny seems master branch is working again so I did some small tests:

  1. Release page: 1.0.0+dev Page: 19309ms Template: 9ms. It seems to list all 147 releases Alpine has, even though it shows a pager at the bottom (paging seems to be broken).
  2. The directory listing of aports/main takes ages (it never completes), so it's unusable for projects with lots of items in their directories. There seems to already be a PR for this: "Improve the performance of tree listing" (#570).

@lunny
Member

lunny commented Jan 23, 2017

Yes. This issue should be fixed by #570 .

@lunny
Member

lunny commented Feb 24, 2017

Moving this to v1.2 since #570 has been moved.

@lunny lunny modified the milestones: 1.2.0, 1.1.0 Feb 24, 2017
@lunny lunny modified the milestones: 1.x.x, 1.2.0 Apr 20, 2017
@clandmeter

@lunny any progress in this area?

I'm still getting very slow loads on directories with a lot of content:
Gitea Version: d545e32 Page: 418155ms Template: 11903ms

https://try.gitea.io/clandmeter/aports/src/branch/master/community

Would it be possible to have a pager or disable the loading of commit history?

@lunny
Member

lunny commented Nov 3, 2017

GitHub only shows the first 1000 files.

@dfredell

Oh my @filipnavara Awesome job on the performance improvements. I pulled your perf-read branch and built it.
I have a repo with 25,000 files in one folder. The previous gitea web ui would take > 3 hours to load, but with your branch it loaded in 19s!
I would love to see this change make it into the master line.

@filipnavara
Contributor

@dfredell Unfortunately I am busy and don't have time to upstream it. However I do update the branch every now and then to track upstream changes. Once #6364 gets merged I will do it again and probably open a PR to start the discussion.

@LukeOwlclaw

#6364 was merged yesterday! 😄

@jchook

jchook commented Sep 5, 2019

This issue seems to persist on the demo site (e.g. for the golang repo), taking 4.2s to respond to an HTTP GET.

During evaluation, this kind of problem might cause someone to use cgit instead.

@davidsvantesson
Contributor

I have a repo with a folder with more than 2000 files. This takes ~25 seconds to load (not production site), of which 24 seconds are spent in getLastCommitForPaths (run from recent Gitea master branch).

In addition to any performance improvements possible, maybe a new option could be introduced to display only file names (without latest commit info) if a folder contains more than x entries (folders and files). That way very big folders can still be shown quickly but if you want to see commit details/history you need to enter the specific file.
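
A minimal sketch of the threshold idea above, in Go. This is not Gitea code: `Entry`, `listDir`, `maxEntries`, and the `lookup` callback are hypothetical names chosen for illustration. The expensive per-entry commit lookup is only performed when the folder has at most `maxEntries` entries; otherwise only names are returned.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for one row of a directory listing.
type Entry struct {
	Name       string
	LastCommit string // left empty when commit info was skipped
}

// listDir builds the rows for a folder. The lookup callback (assumed to be
// the expensive part) is only called when the folder has at most
// maxEntries entries.
func listDir(names []string, maxEntries int, lookup func(string) string) []Entry {
	withCommits := len(names) <= maxEntries
	rows := make([]Entry, 0, len(names))
	for _, n := range names {
		e := Entry{Name: n}
		if withCommits {
			e.LastCommit = lookup(n)
		}
		rows = append(rows, e)
	}
	return rows
}

func main() {
	lookup := func(name string) string { return "commit-for-" + name }

	small := listDir([]string{"a.txt", "b.txt"}, 2, lookup)
	large := listDir([]string{"a.txt", "b.txt", "c.txt"}, 2, lookup)

	fmt.Println(small[0].LastCommit) // commit info present
	fmt.Println(large[0].LastCommit) // empty: folder exceeded the limit
}
```

With this shape, the per-file history view would remain the place to see commit details for very large folders.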

@filipnavara
Contributor

filipnavara commented Sep 13, 2019

@davidsvantesson You can speed it up a bit by building commit-graph file (git commit-graph write). I would be interested in how much it helps for your repository.
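
For anyone wanting to script this, here is a small Go sketch that builds the command mentioned above for a given repository. The function name `commitGraphCmd` and the repository path are placeholders; the sketch only constructs the command and leaves running it to the caller.

```go
package main

import (
	"fmt"
	"os/exec"
)

// commitGraphCmd builds `git -C <repoPath> commit-graph write`.
// The name and the path passed in main are illustrative only.
func commitGraphCmd(repoPath string) *exec.Cmd {
	return exec.Command("git", "-C", repoPath, "commit-graph", "write")
}

func main() {
	cmd := commitGraphCmd("/path/to/repo.git")
	fmt.Println(cmd.Args)
	// To actually run it against a real repository:
	//   out, err := cmd.CombinedOutput()
}
```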

@davidsvantesson
Contributor

davidsvantesson commented Sep 13, 2019

@filipnavara That is very interesting, but I do not see any change in performance when listing repo files in Gitea. Maybe Gitea doesn't run operations that benefit from it?

Edit: That is strange, because my build includes the code from #7314, but it doesn't seem to improve performance for me. I will investigate further.

@davidsvantesson
Contributor

davidsvantesson commented Sep 14, 2019

I think the problem is that getLastCommitForPaths has to check the (remaining) paths against every commit. Take an extreme case where 2000 files are added in the initial repo commit and then 10000 commits are made that don't change anything in the folder. It will then have to loop 20,000,000 times (2000*10000). In a more realistic case where the files are added one by one, it will still be about half of that.

It would be interesting to change to use something like git log --max-count=1 on each file to see how it affects the performance.
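
The arithmetic above can be checked with a simplified cost model in Go. This is an assumption-laden sketch, not a measurement of Gitea: `pathChecks` is a hypothetical function that counts one check per unresolved path per commit, and the "spread" case assumes the files were added at evenly spaced points in history and never modified afterwards.

```go
package main

import "fmt"

// pathChecks counts path-vs-commit checks when walking `commits` commits
// from newest to oldest for a folder with `files` entries.
// spread=false: all files come from the oldest commit, so every path stays
// unresolved for the whole walk.
// spread=true: files were added evenly over the history, so the number of
// unresolved paths shrinks linearly during the walk.
func pathChecks(files, commits int, spread bool) int {
	checks := 0
	for c := 0; c < commits; c++ {
		remaining := files
		if spread {
			remaining = files - c*files/commits
		}
		checks += remaining
	}
	return checks
}

func main() {
	fmt.Println(pathChecks(2000, 10000, false)) // 20000000
	fmt.Println(pathChecks(2000, 10000, true))  // 10005000
}
```

Under this model the evenly spread case comes out at roughly half of the worst case, matching the estimate above.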

@filipnavara
Contributor

Yes, that is the pathological case and there is no way around it unless you introduce some new cache or statistical structure (bloom filters) to speed this up. The algorithm in getLastCommitForPaths goes through the history only once and thus saves a lot of git object accesses compared to running git log on each file.
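
To illustrate the single-pass idea, here is a toy Go sketch. It is not the actual getLastCommitForPaths implementation: the `Commit` type, the precomputed `Changed` list, and `lastCommitSinglePass` are assumptions made for the example (real git has to diff trees to learn which paths a commit changed).

```go
package main

import "fmt"

// Commit is a toy commit: an ID plus the paths it changed.
// Real git does not store this list; it is derived by comparing trees.
type Commit struct {
	ID      string
	Changed []string
}

// lastCommitSinglePass walks the history once, newest first, and records
// for each requested path the first commit seen that touched it.
// It stops early once every path has been resolved.
func lastCommitSinglePass(history []Commit, paths []string) map[string]string {
	pending := make(map[string]bool, len(paths))
	for _, p := range paths {
		pending[p] = true
	}
	result := make(map[string]string, len(paths))
	for _, c := range history {
		if len(pending) == 0 {
			break
		}
		for _, p := range c.Changed {
			if pending[p] {
				result[p] = c.ID
				delete(pending, p)
			}
		}
	}
	return result
}

func main() {
	history := []Commit{ // newest first
		{ID: "c3", Changed: []string{"a"}},
		{ID: "c2", Changed: []string{"b"}},
		{ID: "c1", Changed: []string{"a", "b", "c"}},
	}
	fmt.Println(lastCommitSinglePass(history, []string{"a", "b", "c"}))
}
```

Running a separate history walk per file would visit the same commits once for every path, which is the extra object access the single pass avoids.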

@davidsvantesson
Contributor

@filipnavara A simple command-line git operation made it clear that Gitea is already very efficient. The limitation seems to be in Git itself. The performance of this operation can't be improved much, since git doesn't cache the information we want in the tree, and it also doesn't directly store which files are changed by a commit, so we end up with this high order of complexity. I find it a bit strange that there is no option to cache additional information in git to speed this up, as it should be quite a common use case.

I still think not showing this information (by default) for very large folders can be useful for these special cases.

@guillep2k
Member

It's normal to warn the user if a diff will be too large, or if there are too many files to diff. So for this operation too, I think it's useful to hold back on the details if there is some indication that the operation will take too long (e.g. repository size? some statistics?).

@lunny
Member

lunny commented Sep 15, 2019

I think this could be improved by adding a cache layer in front of the git commands.
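
One possible shape for such a cache, as a hedged Go sketch. Nothing here is Gitea's API: `LastCommitCache`, `cacheKey`, and the `compute` callback are hypothetical. The sketch keys entries by commit ID and directory path; since a commit ID names immutable content, an entry never becomes stale.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one directory listing at one commit.
type cacheKey struct {
	CommitID string
	TreePath string
}

// LastCommitCache memoizes the path -> last-commit map per (commit, dir).
type LastCommitCache struct {
	mu      sync.Mutex
	entries map[cacheKey]map[string]string
}

func NewLastCommitCache() *LastCommitCache {
	return &LastCommitCache{entries: make(map[cacheKey]map[string]string)}
}

// Get returns the cached listing or runs compute (the slow git walk) once.
// Holding the lock during compute keeps the sketch simple but serializes
// all callers; a real implementation would likely want finer locking.
func (c *LastCommitCache) Get(commitID, treePath string, compute func() map[string]string) map[string]string {
	k := cacheKey{CommitID: commitID, TreePath: treePath}
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[k]; ok {
		return v
	}
	v := compute()
	c.entries[k] = v
	return v
}

func main() {
	cache := NewLastCommitCache()
	calls := 0
	slowWalk := func() map[string]string {
		calls++
		return map[string]string{"README.md": "abc123"}
	}
	cache.Get("deadbeef", "main", slowWalk)
	cache.Get("deadbeef", "main", slowWalk)
	fmt.Println(calls) // 1: the second call was served from the cache
}
```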

@davidsvantesson
Contributor

A cache system would be good for viewing the "HEAD", which is what most people use. It wouldn't help if someone wants to browse old history, unless some cache option is built into git for all trees (which I think would be outside the scope of Gitea).

@guillep2k I thought it could be based on the number of entries (folders and files) in the folder being displayed. However, a more accurate indication of the time needed would be the number of entries times the number of commits (in that folder), which you can still obtain with little effort.
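
As a rough Go sketch of that estimate: the function name `showCommitInfo` and the `budget` value are made up for illustration, and a suitable budget would have to be found by measurement.

```go
package main

import "fmt"

// showCommitInfo decides whether to load last-commit details, using
// entries * commits as a proxy for the work the history walk may need.
// The budget is a tunable, hypothetical limit on that product.
func showCommitInfo(entries, commits, budget int) bool {
	return entries*commits <= budget
}

func main() {
	const budget = 1000000 // arbitrary example value
	fmt.Println(showCommitInfo(50, 10000, budget))   // true
	fmt.Println(showCommitInfo(2000, 10000, budget)) // false
}
```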

@filipnavara
Contributor

filipnavara commented Sep 15, 2019

I have a prototype implementation of the git bloom filters which speed up browsing both HEAD and old history. I didn't pursue it further because I waited for an official git implementation. That said, I can revive it if anyone is brave enough to give it a try.
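
For readers unfamiliar with the technique, here is a toy Bloom filter in Go showing why it helps. It is not the prototype mentioned above and not git's on-disk format: the `Bloom` type, its size, and the use of FNV hashes are assumptions for illustration. One filter per commit would hold the paths that commit changed; if `MayContain` returns false, the commit definitely did not touch the path and can be skipped without diffing trees.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a toy Bloom filter over strings with m bits and k probes.
type Bloom struct {
	bits []bool
	k    int
}

func NewBloom(m, k int) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit positions from two FNV hashes (double hashing).
func (b *Bloom) positions(s string) []int {
	h1 := fnv.New32a()
	h1.Write([]byte(s))
	h2 := fnv.New32()
	h2.Write([]byte(s))
	x, y := h1.Sum32(), h2.Sum32()
	pos := make([]int, b.k)
	for i := range pos {
		pos[i] = int((x + uint32(i)*y) % uint32(len(b.bits)))
	}
	return pos
}

// Add records a path in the filter.
func (b *Bloom) Add(s string) {
	for _, p := range b.positions(s) {
		b.bits[p] = true
	}
}

// MayContain returns false only if s was definitely never added.
// A true result can be a false positive and still needs a real check.
func (b *Bloom) MayContain(s string) bool {
	for _, p := range b.positions(s) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	changed := NewBloom(256, 3) // filter for one commit's changed paths
	changed.Add("main/busybox/APKBUILD")

	fmt.Println(changed.MayContain("main/busybox/APKBUILD")) // true
	fmt.Println(NewBloom(256, 3).MayContain("anything"))     // false
}
```

The filter trades a small amount of extra storage per commit for the ability to rule out most commits cheaply, which applies to HEAD and to old history alike.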

@strk
Member

strk commented Oct 22, 2019

This is still a problem with 1.9.4 (for the record). I get 6 seconds rendering for a mirror of qgis:
https://dev.git.osgeo.org/gitea/qgis/qgis

Note that try.gitea.io gives a 500 (Internal Server Error) on the page I tried to setup for that:
https://try.gitea.io/strk/QGIS

@davidsvantesson
Contributor

@filipnavara Do you have an insight in the chances that bloom filters get into git officially anytime soon?

What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?

@filipnavara
Contributor

Do you have an insight in the chances that bloom filters get into git officially anytime soon?

I don't know if there has been any progress. There were a few people who were interested in it, but it didn't really move forward except for a few experimental implementations at the end of last year.

What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?

Azure Git hosting does exactly that. It is a perfectly doable and viable approach in the short term, at a small storage expense for duplicating some data structures. I will be happy to release my experimental implementation if someone wants to pick it up after me. I wrote all the code for reading and producing the bloom filters. The reading part was easy to integrate into Gitea. The writing part I did not integrate at all, and that still needs to be done (manual index building and scheduled index building). I currently don't have any free engineering hours to dedicate to the project, but I will be more than happy to help with it in any other way.

@guillep2k
Member

What problems could it be for Gitea to use some own/unofficial implementation of bloom filters (risks, effort etc)?

@davidsvantesson I don't know what bloom filters are, but Gitea currently supports a considerable span of git versions, and there are plans to migrate to a pure golang implementation (I don't recall the library name). So, I wouldn't count much on implementing something that requires the latest git version. 😅

@filipnavara
Contributor

@guillep2k It is called go-git, and I was one of the people who migrated Gitea code to use it. Coincidentally, I was also the person who added one of the latest git features to go-git (commit graph files) specifically to speed up Gitea file listing. I also implemented the bloom filters on top of go-git in a file format that was compatible with one of the implementations discussed on the git mailing list... so I would say that it is very much possible to use the latest git features if there is a use case for it and sufficient demand.

@guillep2k
Member

@filipnavara Yes, go-git was it. What I meant is that we should not count on users having the latest git installed on their systems. We can certainly provide the feature if it's implemented inside Gitea itself.

@filipnavara
Contributor

We can certainly provide the feature if it's implemented inside Gitea itself.

That's exactly what I do, both in Gitea and indirectly in go-git. The commit-graph file and the bloom filters are optional git indexes stored in the .git directory. Gitea/go-git can consume and generate them, and a new enough git can use them if they exist.

@zeripath
Contributor

It seems that some of the docker users aren't getting the git commitGraph gitconfig changes.

@lafriks lafriks modified the milestones: 1.x.x, 1.12.0 Feb 1, 2020
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020