make repository invalidation easier on the system #3157

Closed
mamh2021 opened this issue Jun 1, 2020 · 16 comments
Assignees
Labels
enhancement webapp web application

Comments

@mamh2021

mamh2021 commented Jun 1, 2020

Describe the bug
We index five large Android source trees.
After restarting Tomcat, the web page takes a very long time to load.
The ps command shows many git processes:

root     12983     0  0 04:12 ?        00:00:00 [git]
root     13029     1  0 04:12 ?        00:00:00 [git] <defunct>
root     13030     1  0 04:12 ?        00:00:00 [git] <defunct>
root     13037     1  0 04:12 ?        00:00:00 [git]
root     13038     1  0 04:12 ?        00:00:00 [git] <defunct>
root     13039     1  0 04:12 ?        00:00:00 [git] <defunct>
root     13095     1  0 04:12 ?        00:00:00 /usr/bin/git log --format=commit:%H%nDate:%at -n 1 tag_MP_20170522021208 --

Tomcat runs in Docker; the image is tomcat:9.0.
The command used to start the container is:

 docker run --restart always -d -e JAVA_OPTS='-Xmx60g' -p 80:8080 --name tomcat -v /work/buildsrv-ci:/work/buildsrv-ci -v /work/mirror:/work/mirror -v /opt/git:/opt/git -v /work/buildsrv-ci/web_root:/usr/local/tomcat/webapps   tomcat:9.0

The host server runs Ubuntu 18.04.
OpenGrok is 1.3.11 (* 77b6669 - 1.3.11 - Vladimir Kotal - (9 weeks ago))

Screenshots
[two images, labeled "1" and "3" in the original issue; not reproduced here]

The index command in indexer.sh is shown below:

#!/bin/bash -x
export OPENGROK_HOME=/home/buildsrv-ci/opengrok
export PATH=$PATH:$OPENGROK_HOME/opengrok-tools/bin
export WORK=/work/buildsrv-ci
export SITE=$1

opengrok-indexer \
  -J=-Xmx100g -J=-Djava.util.logging.config.file=$OPENGROK_HOME/etc/logging.properties \
  -a $OPENGROK_HOME/dist/opengrok-1.3.11/lib/opengrok.jar -- \
  -s $WORK/src_root/${SITE} -d $WORK/data_root/${SITE} -H -P -S -G \
  -W $WORK/data_root/${SITE}/configuration.xml -U http://androidxref.example.com/${SITE}

vladak added the question label Jun 1, 2020
@vladak
Member

vladak commented Jun 1, 2020

All these git commands are spawned via HistoryGuru#invalidateRepositories() which goes through all repositories and checks their status. This sort of work is legitimate. This happens whenever new configuration is set. In this case the new configuration is set at the end of indexing.

The question is whether the slow load of the index page is caused by internal delays (e.g. locking, or Git repository handling in OpenGrok spawning superfluous git commands) or simply by high CPU load on the machine. I think the 2nd screenshot hints at the latter.

One way to work around this would be to avoid setting the configuration at the end of the indexing, assuming the projects/repositories do not change. The same goes for repository scanning (-S). This might mean switching to a different workflow; see https://github.com/oracle/opengrok/wiki/Per-project-management-and-workflow

@vladak
Member

vladak commented Jun 1, 2020

For the Git repository handling inefficiency, this is actually a bug - see #2986

@mamh2021
Author

mamh2021 commented Jun 1, 2020

Thank you very much for your reply.

@vladak
Member

vladak commented Jun 8, 2020

Could you retry with 1.3.16 to see if the problem persists?

@mamh2021
Author

mamh2021 commented Sep 4, 2020

I see #2986, thanks for your reply.

I found that when the WAR is redeployed to Tomcat, the method buildTagList() in GitRepository.java is called.
Does this method block the web thread? Does it not run in a background thread?

My Git repositories have many tags, so the git tag command takes a long time.

@vladak
Member

vladak commented Sep 4, 2020

When the webapp starts, invalidateRepositories() is called, which in turn calls buildTagList() via getRepository(). All the handling is sequential, so yes, the webapp will finish deploying only after this task finishes.

@mamh2021
Author

mamh2021 commented Sep 4, 2020

I found that buildTagList() runs in another thread (final ExecutorService executor = Executors.newFixedThreadPool),
but the line "latch.await();" will block the web thread. Am I correct?
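
To illustrate the question above, here is a minimal sketch of the general pattern: work is handed to a fixed thread pool, but the caller still blocks on latch.await() until every task has counted down. This is not OpenGrok's code; the class name, slowRepositoryTask() and invalidateAll() are hypothetical stand-ins, and the sleep merely simulates a slow git invocation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LatchBlockingSketch {

    // Hypothetical stand-in for per-repository work such as building a tag list.
    static void slowRepositoryTask(AtomicInteger done) {
        try {
            Thread.sleep(50); // simulates a slow "git" invocation
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        done.incrementAndGet();
    }

    // Runs repoCount tasks on a pool and returns how many had finished
    // by the time the caller was released from latch.await().
    public static int invalidateAll(int repoCount, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        CountDownLatch latch = new CountDownLatch(repoCount);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < repoCount; i++) {
            executor.submit(() -> {
                try {
                    slowRepositoryTask(done);
                } finally {
                    latch.countDown();
                }
            });
        }
        try {
            // The calling thread waits here until the count reaches zero,
            // even though the work itself runs on pool threads.
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return done.get();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int finished = invalidateAll(8, 2);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(finished + " tasks finished before the caller resumed, after about "
                + elapsed + " ms");
    }
}
```

In this sketch the pool only spreads the work across threads; the caller's wall-clock wait is still bounded below by the slowest batch of tasks, which is why a webapp start-up that waits this way feels sequential.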

@mamh2021
Author

mamh2021 commented Sep 4, 2020

I don't understand why WebappListener calls buildTagList().
We have already used the offline indexer (the Python opengrok-indexer command) to create the data, so why do we need buildTagList() when the WAR loads?
Can I remove this method call?

Thank you very much for your reply. I've learned a lot from it.

@vladak
Member

vladak commented Sep 4, 2020

That's a good question. Quite possibly the tag list is not needed when the webapp is starting. AFAIK it is only used in the history view of particular directory/file.

@mamh2021
Author

mamh2021 commented Sep 4, 2020

Maybe we can move all the time-consuming git operations to the offline indexer and save the results to a local file?
The webapp could then reload that file; loading a file should be quick.
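
The proposal above could be sketched roughly as follows. This only illustrates the idea of persisting a tag list at index time and reading it back at webapp start; it is not how OpenGrok stores repository data, and the class name, method names and one-tag-per-line file format are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class TagListCacheSketch {

    // Indexer side (hypothetical): persist tag names gathered by the slow git call,
    // one tag per line.
    public static void save(Path cacheFile, List<String> tags) {
        try {
            Files.write(cacheFile, tags, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Webapp side (hypothetical): read the cached tag names instead of spawning git.
    public static List<String> load(Path cacheFile) {
        try {
            return Files.readAllLines(cacheFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience helper for the example: save to a temporary file, load, clean up.
    public static List<String> roundTrip(List<String> tags) {
        try {
            Path cache = Files.createTempFile("tags", ".txt");
            try {
                save(cache, tags);
                return load(cache);
            } finally {
                Files.deleteIfExists(cache);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The second tag name is made up for illustration.
        System.out.println(roundTrip(Arrays.asList("tag_MP_20170522021208", "tag_example")));
    }
}
```

The trade-off of any such cache is freshness: the webapp would see the tags as of the last indexer run rather than the current repository state.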

@vladak
Member

vladak commented Sep 4, 2020

The information about the repositories is already saved to the configuration by the indexer (when running with -W, that is). I think the invalidation is done so that the webapp has a fresh view of the repositories.

@ChristopheBordieu

I have exactly the same trouble with OG 1.5.11, 1.9k projects and 30k Git repos. After restarting the Tomcat webapp, it takes ages before the web service becomes accessible.
Is this issue supposed to be fixed by #2986?

@vladak
Member

vladak commented Mar 9, 2021

It might help a bit. #1113 might have a more profound effect for your use case.

@mamh2021
Author

> I have exactly the same trouble with OG 1.5.11, 1.9k projects and 30k Git repos. After restarting the Tomcat webapp, it takes ages before the web service becomes accessible.
> Is this issue supposed to be fixed by #2986?

Disabling tags ("-G", "--assignTags") might help.

@ChristopheBordieu

Thanks @mamh2021 but it does not help.

@vladak
Member

vladak commented Nov 22, 2021

I wonder how things changed for you in the recent versions.

There are 2 things that can be done here:

  • lower the concurrency level of repository invalidation
    • see the indexingParallelism property on https://github.com/oracle/opengrok/wiki/Webapp-configuration#configuration-tunables
      • it might make sense to introduce separate tunable for the invalidation with default value that would be a fraction of the number of available processors in the system (at the cost of the invalidation process taking longer)
        • given that invalidateRepositories() might be called from the indexer there could be a duality of the concurrency level properties
  • change RepositoryFactory#getRepository() to avoid calling buildTagList() when called from the invalidateRepositories() code path
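
The two options in the list above could look roughly like this. It is a sketch under stated assumptions rather than the actual change: Repo, invalidationParallelism(), the divisor parameter and the buildTags flag are hypothetical names, and the real RepositoryFactory#getRepository() signature is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvalidationTuningSketch {

    // Hypothetical repository stand-in; not OpenGrok's Repository class.
    public static final class Repo {
        private final String path;
        private boolean tagsBuilt;

        Repo(String path) { this.path = path; }

        void buildTagList() { tagsBuilt = true; } // a real one would spawn git

        public String getPath() { return path; }
        public boolean hasTags() { return tagsBuilt; }
    }

    // Option 1 (assumed default): a fraction of the available processors, never below 1.
    public static int invalidationParallelism(int availableProcessors, int divisor) {
        return Math.max(1, availableProcessors / divisor);
    }

    // Option 2: a factory method with a flag so callers can skip the tag list.
    public static Repo getRepository(String path, boolean buildTags) {
        Repo repo = new Repo(path);
        if (buildTags) {
            repo.buildTagList();
        }
        return repo;
    }

    // Invalidation path: limited concurrency and no tag list building.
    public static List<Repo> invalidateRepositories(List<String> paths, int parallelism) {
        ExecutorService executor = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Repo>> futures = new ArrayList<>();
            for (String path : paths) {
                Callable<Repo> task = () -> getRepository(path, false);
                futures.add(executor.submit(task));
            }
            List<Repo> result = new ArrayList<>();
            for (Future<Repo> future : futures) {
                result.add(future.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int parallelism = invalidationParallelism(Runtime.getRuntime().availableProcessors(), 4);
        List<String> paths = new ArrayList<>();
        paths.add("/src/project-a"); // made-up paths for illustration
        paths.add("/src/project-b");
        List<Repo> repos = invalidateRepositories(paths, parallelism);
        System.out.println(repos.size() + " repositories invalidated with parallelism " + parallelism);
    }
}
```

Lower parallelism trades a longer invalidation for less load on the machine, which matches the cost noted in the list.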

vladak added enhancement webapp web application and removed question labels Nov 22, 2021
vladak changed the title from "after restart tomcat the web page load take too long time" to "make repository invalidation easier on the system" Nov 22, 2021
vladak self-assigned this Nov 22, 2021
vladak pushed a commit to vladak/OpenGrok that referenced this issue Nov 22, 2021
Also, do not rebuild repository tags in the web app.

fixes oracle#3157
@vladak vladak closed this as completed in 2123e36 Nov 22, 2021